PREFACE


This book is designed to support a one-semester course in numerical methods. It has been 
written for students who want to learn and apply numerical methods in order to solve prob- 
lems in engineering and science. As such, the methods are motivated by problems rather 
than by mathematics. That said, sufficient theory is provided so that students come away 
with insight into the techniques and their shortcomings. 

MATLAB® provides a great environment for such a course. Although other environments
(e.g., Excel/VBA, Mathcad) or languages (e.g., Fortran 90, C++) could have been chosen,
MATLAB presently offers a nice combination of handy programming features with powerful
built-in numerical capabilities. On the one hand, its M-file programming environment
allows students to implement moderately complicated algorithms in a structured and
coherent fashion. On the other hand, its built-in numerical capabilities empower students
to solve more difficult problems without trying to “reinvent the wheel.”

The basic content, organization, and pedagogy of the third edition are essentially pre- 
served in the fourth edition. In particular, the conversational writing style is intentionally 
maintained in order to make the book easier to read. This book tries to speak directly to the 
reader and is designed in part to be a tool for self-teaching. 

That said, this edition differs from the previous edition in three major ways: (1) new
material, (2) new and revised homework problems, and (3) an appendix introducing 
Simulink. 


1. New Content. I have included new and enhanced sections on a number of topics. The 
primary additions include material on some MATLAB functions not included in previ- 
ous editions (e.g., fsolve, integrate, bvp4c), some new applications of Monte Carlo 
for problems such as integration and optimization, and MATLAB’s new way to pass 
parameters to function functions. 

2. New Homework Problems. Most of the end-of-chapter problems have been modified, 
and a variety of new problems have been added. In particular, an effort has been made 
to include several new problems for each chapter that are more challenging and dif- 
ficult than the problems in the previous edition. 

3. I have developed a short primer on Simulink, which I have my students read prior to
covering that topic. Although I recognize that some professors may not choose to
cover Simulink, I included it as a teaching aid for those who do.




Aside from the new material and problems, the fourth edition is very similar to the 
third. In particular, I have endeavored to maintain most of the features contributing to its 
pedagogical effectiveness including extensive use of worked examples and engineering and 
scientific applications. As with the previous edition, I have made a concerted effort to make 
this book as “student-friendly” as possible. Thus, I’ve tried to keep my explanations
straightforward and practical.

Although my primary intent is to empower students by providing them with a sound 
introduction to numerical problem solving, I have the ancillary objective of making this 
introduction exciting and pleasurable. I believe that motivated students who enjoy engi- 
neering and science, problem solving, mathematics—and yes—programming, will ulti- 
mately make better professionals. If my book fosters enthusiasm and appreciation for these 
subjects, I will consider the effort a success. 


Acknowledgments. Several members of the McGraw-Hill team have contributed to 
this project. Special thanks are due to Jolynn Kilburg, Thomas Scaife, Ph.D., Chelsea
[...] creative projects such as this book dealing with computing and engineering. In addition,
my colleagues in the School of Engineering at Tufts, notably Masoud Sanayei, Babak
[...] supportive and helpful.


PEDAGOGICAL TOOLS


Theory Presented as It Informs Key Concepts. The text is intended for Numerical
Methods users, not developers. Therefore, theory is not included for “theory’s sake”; for
example, there are no proofs. Theory is included as it informs key concepts such as the
Taylor series, convergence, condition, etc. Hence, the student is shown how the theory
connects with practical issues in problem solving.


Introductory MATLAB Material. The text includes two introductory chapters on how to 
use MATLAB. Chapter 2 shows students how to perform computations and create graphs 
in MATLAB’s standard command mode. Chapter 3 provides a primer on developing 
numerical programs via MATLAB M-file functions. Thus, the text provides students with 
the means to develop their own numerical algorithms as well as to tap into MATLAB’s 
powerful built-in routines. 


Algorithms Presented Using MATLAB M-files. Instead of using pseudocode, this book 
presents algorithms as well-structured MATLAB M-files. Aside from being useful com- 
puter programs, these provide students with models for their own M-files that they will 
develop as homework exercises. 


Worked Examples and Case Studies. Extensive worked examples are laid out in detail 
so that students can clearly follow the steps in each numerical computation. The case stud- 
ies consist of engineering and science applications which are more complex and richer than 
the worked examples. They are placed at the ends of selected chapters with the intention 
of (1) illustrating the nuances of the methods and (2) showing more realistically how the 
methods along with MATLAB are applied for problem solving. 


Problem Sets. The text includes a wide variety of problems. Many are drawn from en- 
gineering and scientific disciplines. Others are used to illustrate numerical techniques and 
theoretical concepts. Problems include those that can be solved with a pocket calculator as 
well as others that require computer solution with MATLAB. 


Useful Appendices and Indexes. Appendix A contains MATLAB commands, Appendix
B contains M-file functions, and new Appendix C contains a brief Simulink primer. 


Instructor Resources. Solutions Manual, Lecture PowerPoints, Text images in Power- 
Point, M-files and additional MATLAB resources are available through Connect®. 


Modeling, Computers,
and Error Analysis


1.1 MOTIVATION


What are numerical methods and why should you study them? 

Numerical methods are techniques by which mathematical problems are formulated 
so that they can be solved with arithmetic and logical operations. Because digital comput- 
ers excel at performing such operations, numerical methods are sometimes referred to as 
computer mathematics. 

In the pre-computer era, the time and drudgery of implementing such calculations 
seriously limited their practical use. However, with the advent of fast, inexpensive digital 
computers, the role of numerical methods in engineering and scientific problem solving 
has exploded. Because they figure so prominently in much of our work, I believe that 
numerical methods should be a part of every engineer’s and scientist’s basic education. 
Just as we all must have solid foundations in the other areas of mathematics and science, 
we should also have a fundamental understanding of numerical methods. In particular, we 
should have a solid appreciation of both 
their capabilities and their limitations. 

Beyond contributing to your overall 
education, there are several additional 
reasons why you should study numerical 
methods: 


1. Numerical methods greatly expand the 
types of problems you can address. 
They are capable of handling large sys- 
tems of equations, nonlinearities, and 
complicated geometries that are not 
uncommon in engineering and science 
and that are often impossible to solve 
analytically with standard calculus. As 
such, they greatly enhance your prob- 
lem-solving skills. 

2. Numerical methods allow you to use “canned” software with insight. During
your career, you will invariably have occasion to use commercially available
prepackaged computer programs that involve numerical methods. The intelligent use of these
programs is greatly enhanced by an understanding of the basic theory underlying the 
methods. In the absence of such understanding, you will be left to treat such packages 
as “black boxes” with little critical insight into their inner workings or the validity of 
the results they produce. 

3. Many problems cannot be approached using canned programs. If you are conversant 
with numerical methods, and are adept at computer programming, you can design 
your own programs to solve problems without having to buy or commission expensive 
software. 

4. Numerical methods are an efficient vehicle for learning to use computers. Because nu- 
merical methods are expressly designed for computer implementation, they are ideal for 
illustrating the computer’s powers and limitations. When you successfully implement 
numerical methods on a computer, and then apply them to solve otherwise intractable 
problems, you will be provided with a dramatic demonstration of how computers can 
serve your professional development. At the same time, you will also learn to acknowl- 
edge and control the errors of approximation that are part and parcel of large-scale 
numerical calculations. 

5. Numerical methods provide a vehicle for you to reinforce your understanding of math- 
ematics. Because one function of numerical methods is to reduce higher mathematics 
to basic arithmetic operations, they get at the “nuts and bolts” of some otherwise 
obscure topics. Enhanced understanding and insight can result from this alternative 
perspective. 


With these reasons as motivation, we can now set out to understand how numerical 
methods and digital computers work in tandem to generate reliable solutions to mathemati- 
cal problems. The remainder of this book is devoted to this task. 


1.2 PART ORGANIZATION


This book is divided into six parts. The latter five parts focus on the major areas of nu- 
merical methods. Although it might be tempting to jump right into this material, Part One 
consists of four chapters dealing with essential background material. 

Chapter 1 provides a concrete example of how a numerical method can be employed
to solve a real problem. To do this, we develop a mathematical model of a free-falling 
bungee jumper. The model, which is based on Newton’s second law, results in an ordinary 
differential equation. After first using calculus to develop a closed-form solution, we then 
show how a comparable solution can be generated with a simple numerical method. We 
end the chapter with an overview of the major areas of numerical methods that we cover in 
Parts Two through Six. 

Chapters 2 and 3 provide an introduction to the MATLAB® software environment. 
Chapter 2 deals with the standard way of operating MATLAB by entering commands one 
at a time in the so-called calculator, or command, mode. This interactive mode provides 
a straightforward means to orient you to the environment and illustrates how it is used for 
common operations such as performing calculations and creating plots. 




Chapter 3 shows how MATLAB’s programming mode provides a vehicle for assem- 
bling individual commands into algorithms. Thus, our intent is to illustrate how MATLAB 
serves as a convenient programming environment to develop your own software. 

Chapter 4 deals with the important topic of error analysis, which must be understood 
for the effective use of numerical methods. The first part of the chapter focuses on the 
roundoff errors that result because digital computers cannot represent some quantities 
exactly. The latter part addresses truncation errors that arise from using an approximation 
in place of an exact mathematical procedure. 


Mathematical Modeling, 
Numerical Methods, 
and Problem Solving 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to provide you with a concrete idea of what 
numerical methods are and how they relate to engineering and scientific problem 
solving. Specific objectives and topics covered are 


• Learning how mathematical models can be formulated on the basis of scientific
  principles to simulate the behavior of a simple physical system.
• Understanding how numerical methods afford a means to generate solutions in a
  manner that can be implemented on a digital computer.
• Understanding the different types of conservation laws that lie beneath the models
  used in the various engineering disciplines and appreciating the difference
  between steady-state and dynamic solutions of these models.
• Learning about the different types of numerical methods we will cover in this
  book.


YOU’VE GOT A PROBLEM 


Suppose that a bungee-jumping company hires you. You’re given the task of
predicting the velocity of a jumper (Fig. 1.1) as a function of time during the
free-fall part of the jump. This information will be used as part of a larger
analysis to determine the length and required strength of the bungee cord for jumpers
of different mass.

FIGURE 1.1
Forces acting on a free-falling bungee jumper: the downward force due to gravity and the
upward force due to air resistance.

You know from your studies of physics that the acceleration should be equal to the ratio
of the force to the mass (Newton’s second law). Based on this insight and your knowledge
of physics and fluid mechanics, you develop the following mathematical model for the rate
of change of velocity with respect to time,

$$\frac{dv}{dt} = g - \frac{c_d}{m} v^2$$

where v = downward vertical velocity (m/s), t = time (s), g = the acceleration due to
gravity (9.81 m/s²), $c_d$ = a lumped drag coefficient (kg/m), and m = the jumper’s
mass (kg). The drag coefficient is called “lumped” because its magnitude depends on
factors such as the jumper’s area and the fluid density (see Sec. 1.4).

Because this is a differential equation, you know that calculus might be used to obtain 
an analytical or exact solution for v as a function of t. However, in the following pages, we 
will illustrate an alternative solution approach. This will involve developing a computer- 
oriented numerical or approximate solution. 

Aside from showing you how the computer can be used to solve this particular prob- 
lem, our more general objective will be to illustrate (a) what numerical methods are and 
(b) how they figure in engineering and scientific problem solving. In so doing, we will also 
show how mathematical models figure prominently in the way engineers and scientists use 
numerical methods in their work. 


1.1 A SIMPLE MATHEMATICAL MODEL


A mathematical model can be broadly defined as a formulation or equation that expresses 
the essential features of a physical system or process in mathematical terms. In a very gen- 
eral sense, it can be represented as a functional relationship of the form 


$$\text{Dependent variable} = f(\text{independent variables},\ \text{parameters},\ \text{forcing functions}) \tag{1.1}$$


where the dependent variable is a characteristic that typically reflects the behavior or state 
of the system; the independent variables are usually dimensions, such as time and space, 
along which the system’s behavior is being determined; the parameters are reflective of 
the system’s properties or composition; and the forcing functions are external influences 
acting upon it. 

The actual mathematical expression of Eq. (1.1) can range from a simple algebraic 
relationship to large complicated sets of differential equations. For example, on the basis 
of his observations, Newton formulated his second law of motion, which states that the 
time rate of change of momentum of a body is equal to the resultant force acting on it. The 
mathematical expression, or model, of the second law is the well-known equation 


$$F = ma \tag{1.2}$$


where F is the net force acting on the body (N, or kg m/s²), m is the mass of the object (kg),
and a is its acceleration (m/s²).




The second law can be recast in the format of Eq. (1.1) by merely dividing both sides
by m to give

$$a = \frac{F}{m} \tag{1.3}$$

where a is the dependent variable reflecting the system’s behavior, F is the forcing
function, and m is a parameter. Note that for this simple case there is no independent variable
because we are not yet predicting how acceleration varies in time or space.


Equation (1.3) has a number of characteristics that are typical of mathematical models 
of the physical world. 


• It describes a natural process or system in mathematical terms.
• It represents an idealization and simplification of reality. That is, the model ignores
  negligible details of the natural process and focuses on its essential manifestations.
  Thus, the second law does not include the effects of relativity that are of minimal
  importance when applied to objects and forces that interact on or about the earth’s
  surface at velocities and on scales visible to humans.
• Finally, it yields reproducible results and, consequently, can be used for predictive
  purposes. For example, if the force on an object and its mass are known, Eq. (1.3) can
  be used to compute acceleration.


Because of its simple algebraic form, the solution of Eq. (1.2) was obtained easily. 
However, other mathematical models of physical phenomena may be much more complex, 
and either cannot be solved exactly or require more sophisticated mathematical techniques 
than simple algebra for their solution. To illustrate a more complex model of this kind, 
Newton’s second law can be used to determine the terminal velocity of a free-falling body 
near the earth’s surface. Our falling body will be a bungee jumper (Fig. 1.1). For this case, 
a model can be derived by expressing the acceleration as the time rate of change of the 
velocity (dv/dt) and substituting it into Eq. (1.3) to yield 

$$\frac{dv}{dt} = \frac{F}{m} \tag{1.4}$$

where v is velocity (in meters per second). Thus, the rate of change of the velocity is equal 
to the net force acting on the body normalized to its mass. If the net force is positive, the 
object will accelerate. If it is negative, the object will decelerate. If the net force is zero, the 
object’s velocity will remain at a constant level. 

Next, we will express the net force in terms of measurable variables and parameters. 
For a body falling within the vicinity of the earth, the net force is composed of two
opposing forces: the downward pull of gravity $F_D$ and the upward force of air resistance $F_U$
(Fig. 1.1):


$$F = F_D + F_U \tag{1.5}$$


If force in the downward direction is assigned a positive sign, the second law can be
used to formulate the force due to gravity as


$$F_D = mg \tag{1.6}$$


where g is the acceleration due to gravity (9.81 m/s²).


Air resistance can be formulated in a variety of ways. Knowledge from the science 
of fluid mechanics suggests that a good first approximation would be to assume that it is 
proportional to the square of the velocity, 


$$F_U = -c_d v^2 \tag{1.7}$$


where $c_d$ is a proportionality constant called the lumped drag coefficient (kg/m). Thus, the
greater the fall velocity, the greater the upward force due to air resistance. The parameter
$c_d$ accounts for properties of the falling object, such as shape or surface roughness, that
affect air resistance. For the present case, $c_d$ might be a function of the type of clothing or the
orientation used by the jumper during free fall.

The net force is the difference between the downward and upward force. Therefore, 
Eqs. (1.4) through (1.7) can be combined to yield 


$$\frac{dv}{dt} = g - \frac{c_d}{m} v^2 \tag{1.8}$$

Equation (1.8) is a model that relates the acceleration of a falling object to the forces 
acting on it. It is a differential equation because it is written in terms of the differential rate 
of change (dv/dt) of the variable that we are interested in predicting. However, in contrast 
to the solution of Newton’s second law in Eq. (1.3), the exact solution of Eq. (1.8) for the 
velocity of the jumper cannot be obtained using simple algebraic manipulation. Rather, 
more advanced techniques such as those of calculus must be applied to obtain an exact or 
analytical solution. For example, if the jumper is initially at rest (v = 0 at t = 0), calculus 
can be used to solve Eq. (1.8) for 


$$v(t) = \sqrt{\frac{gm}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right) \tag{1.9}$$


where tanh is the hyperbolic tangent that can be either computed directly¹ or via the more
elementary exponential function as in


$$\tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}} \tag{1.10}$$

Note that Eq. (1.9) is cast in the general form of Eq. (1.1) where v(t) is the dependent
variable, t is the independent variable, $c_d$ and m are parameters, and g is the forcing function.
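
As a quick check of Eq. (1.10), the sketch below builds the hyperbolic tangent from the exponential function and compares it with a library routine. It is written in Python rather than the MATLAB used in this book, and the function name `tanh_from_exp` is made up for this illustration.

```python
import math

def tanh_from_exp(x):
    """Hyperbolic tangent computed from exponentials, following Eq. (1.10)."""
    ex = math.exp(x)
    emx = math.exp(-x)
    return (ex - emx) / (ex + emx)

# Compare the exponential form with the library's built-in tanh.
for x in (0.0, 0.5, 1.0, 2.0):
    print(f"x = {x:3.1f}  from exp: {tanh_from_exp(x):.6f}  built-in: {math.tanh(x):.6f}")
```

For moderate arguments the two columns should agree to within roundoff. For very large |x| the exponential form can overflow, which is one reason to prefer the built-in function in practice.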


EXAMPLE 1.1 Analytical Solution to the Bungee Jumper Problem


Problem Statement. A bungee jumper with a mass of 68.1 kg leaps from a stationary 
hot air balloon. Use Eq. (1.9) to compute velocity for the first 12 s of free fall. Also deter- 
mine the terminal velocity that will be attained for an infinitely long cord (or alternatively, 
the jumpmaster is having a particularly bad day!). Use a drag coefficient of 0.25 kg/m. 


¹ MATLAB allows direct calculation of the hyperbolic tangent via the built-in function tanh(x).

Solution. Inserting the parameters into Eq. (1.9) yields


$$v(t) = \sqrt{\frac{9.81(68.1)}{0.25}} \tanh\left(\sqrt{\frac{9.81(0.25)}{68.1}}\, t\right) = 51.6938 \tanh(0.18977t)$$


which can be used to compute


t, s    v, m/s
0       0
2       18.7292
4       33.1118
6       42.0762
8       46.9575
10      49.4214
12      50.6175
∞       51.6938


FIGURE 1.2
The analytical solution for the bungee jumper problem as computed in Example 1.1. Velocity
increases with time and asymptotically approaches a terminal velocity.


According to the model, the jumper accelerates rapidly (Fig. 1.2). A velocity of
49.4214 m/s (about 110 mi/hr) is attained after 10 s. Note also that after a sufficiently
long time, a constant velocity, called the terminal velocity, of 51.6938 m/s (115.6 mi/hr)
is reached. This velocity is constant because, eventually, the force of gravity will be in
balance with the air resistance. Thus, the net force is zero and acceleration has ceased.
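
The table in Example 1.1 can be regenerated with a few lines of code. The following is a minimal sketch in Python (the book itself works in MATLAB); the function names and default arguments are assumptions made for this illustration, with the parameter values taken from the problem statement.

```python
import math

def velocity_analytical(t, m=68.1, cd=0.25, g=9.81):
    """Eq. (1.9): velocity (m/s) at time t (s) for a jumper starting from rest."""
    return math.sqrt(g * m / cd) * math.tanh(math.sqrt(g * cd / m) * t)

def terminal_velocity(m=68.1, cd=0.25, g=9.81):
    """Limit of Eq. (1.9) as t grows large, since tanh approaches 1."""
    return math.sqrt(g * m / cd)

# Tabulate velocity every 2 s for the first 12 s, then the limiting value.
for t in range(0, 13, 2):
    print(f"{t:3d}  {velocity_analytical(t):8.4f}")
print(f"inf  {terminal_velocity():8.4f}")
```

The printed values should match the table above to the four decimal places shown.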


Equation (1.9) is called an analytical or closed-form solution because it exactly satis- 
fies the original differential equation. Unfortunately, there are many mathematical models 
that cannot be solved exactly. In many of these cases, the only alternative is to develop a 
numerical solution that approximates the exact solution. 
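
That Eq. (1.9) satisfies Eq. (1.8) can be checked by direct differentiation. The shorthand $k = \sqrt{g c_d/m}$ and $v_T = \sqrt{gm/c_d}$ is introduced here only for convenience; it is not notation used elsewhere in the text.

```latex
\begin{align*}
v(t) &= v_T \tanh(kt), \qquad v(0) = 0 \\
\frac{dv}{dt} &= v_T\, k \,\operatorname{sech}^2(kt)
               = g\left[1 - \tanh^2(kt)\right]
               \qquad \text{since } v_T\, k = \sqrt{\frac{gm}{c_d}}\sqrt{\frac{g c_d}{m}} = g \\
              &= g - g\,\frac{v^2}{v_T^2}
               = g - \frac{c_d}{m}\, v^2
\end{align*}
```

The last line is the right-hand side of Eq. (1.8), and the initial condition v = 0 at t = 0 is also met.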

Numerical methods are those in which the mathematical problem is reformulated so it 
can be solved by arithmetic operations. This can be illustrated for Eq. (1.8) by realizing that 
the time rate of change of velocity can be approximated by (Fig. 1.3): 


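$$\frac{dv}{dt} \cong \frac{\Delta v}{\Delta t} = \frac{v(t_{i+1}) - v(t_i)}{t_{i+1} - t_i} \tag{1.11}$$


where $\Delta v$ and $\Delta t$ are differences in velocity and time computed over finite intervals,
$v(t_i)$ is velocity at an initial time $t_i$, and $v(t_{i+1})$ is velocity at some later time $t_{i+1}$. Note that
$dv/dt \cong \Delta v/\Delta t$ is approximate because $\Delta t$ is finite. Remember from calculus that


$$\frac{dv}{dt} = \lim_{\Delta t \to 0} \frac{\Delta v}{\Delta t}$$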

Equation (1.11) represents the reverse process. 
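
To see how the finite difference behaves as the interval shrinks, the sketch below evaluates the ratio in Eq. (1.11) using the exact velocity from Eq. (1.9) and compares it with the true slope given by Eq. (1.8). It is an illustrative Python sketch rather than MATLAB code from this book; the function names and the choice of t = 2 s are assumptions.

```python
import math

G, M, CD = 9.81, 68.1, 0.25  # g (m/s^2), mass (kg), drag coefficient (kg/m) from Example 1.1

def v_exact(t):
    """Analytical velocity, Eq. (1.9)."""
    return math.sqrt(G * M / CD) * math.tanh(math.sqrt(G * CD / M) * t)

def slope_exact(t):
    """True dv/dt: right-hand side of Eq. (1.8) evaluated with the exact velocity."""
    return G - CD / M * v_exact(t) ** 2

def slope_finite_difference(t, dt):
    """Finite-difference estimate of dv/dt, Eq. (1.11), over the interval [t, t + dt]."""
    return (v_exact(t + dt) - v_exact(t)) / dt

t = 2.0
for dt in (2.0, 1.0, 0.5, 0.1, 0.01):
    approx = slope_finite_difference(t, dt)
    print(f"dt = {dt:5.2f}  estimate = {approx:8.5f}  error = {abs(approx - slope_exact(t)):.5f}")
```

The error column should shrink as dt is reduced, which is the limit statement above seen from the numerical side.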


FIGURE 1.3
The use of a finite difference to approximate the first derivative of v with respect to t.

Equation (1.11) is called a finite-difference approximation of the derivative at time $t_i$.
It can be substituted into Eq. (1.8) to give

$$\frac{v(t_{i+1}) - v(t_i)}{t_{i+1} - t_i} = g - \frac{c_d}{m} v(t_i)^2$$

This equation can then be rearranged to yield

$$v(t_{i+1}) = v(t_i) + \left[g - \frac{c_d}{m} v(t_i)^2\right](t_{i+1} - t_i) \tag{1.12}$$

Notice that the term in brackets is the right-hand side of the differential equation itself
[Eq. (1.8)]. That is, it provides a means to compute the rate of change or slope of v. Thus,
the equation can be rewritten more concisely as

$$v_{i+1} = v_i + \frac{dv_i}{dt} \Delta t \tag{1.13}$$

where the nomenclature $v_i$ designates velocity at time $t_i$, and $\Delta t = t_{i+1} - t_i$.

We can now see that the differential equation has been transformed into an equation that
can be used to determine the velocity algebraically at $t_{i+1}$ using the slope and previous values
of v and t. If you are given an initial value for velocity at some time $t_i$, you can easily compute
velocity at a later time $t_{i+1}$. This new value of velocity at $t_{i+1}$ can in turn be employed to extend
the computation to velocity at $t_{i+2}$ and so on. Thus at any time along the way,

New value = old value + slope × step size

This approach is formally called Euler’s method. We’ll discuss it in more detail when we
turn to differential equations later in this book.

EXAMPLE 1.2 Numerical Solution to the Bungee Jumper Problem 


Problem Statement. Perform the same computation as in Example 1.1 but use Eq. (1.12) 
to compute velocity with Euler’s method. Employ a step size of 2 s for the calculation. 


Solution. At the start of the computation ($t_0$ = 0), the velocity of the jumper is zero.
Using this information and the parameter values from Example 1.1, Eq. (1.12) can be used
to compute velocity at $t_1$ = 2 s:


$$v = 0 + \left[9.81 - \frac{0.25}{68.1}(0)^2\right] \times 2 = 19.62 \text{ m/s}$$


For the next interval (from t = 2 to 4 s), the computation is repeated, with the result


$$v = 19.62 + \left[9.81 - \frac{0.25}{68.1}(19.62)^2\right] \times 2 = 36.4137 \text{ m/s}$$


FIGURE 1.4
Comparison of the numerical and analytical solutions for the bungee jumper problem.


The calculation is continued in a similar fashion to obtain additional values: 


t, s    v, m/s
0       0
2       19.6200
4       36.4137
6       46.2983
8       50.1802
10      51.3123
12      51.6008
∞       51.6938


The results are plotted in Fig. 1.4 along with the exact solution. We can see that the numeri- 
cal method captures the essential features of the exact solution. However, because we have 
employed straight-line segments to approximate a continuously curving function, there is 
some discrepancy between the two results. One way to minimize such discrepancies is to 
use a smaller step size. For example, applying Eq. (1.12) at 1-s intervals results in a smaller 
error, as the straight-line segments track closer to the true solution. Using hand calcula- 
tions, the effort associated with using smaller and smaller step sizes would make such 
numerical solutions impractical. However, with the aid of the computer, large numbers of 
calculations can be performed easily. Thus, you can accurately model the velocity of the 
jumper without having to solve the differential equation exactly. 
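
The hand calculation in Example 1.2 translates directly into a short loop. Below is a minimal sketch of Euler’s method for Eq. (1.8), written in Python rather than as a MATLAB M-file; the function names `dvdt` and `euler_velocity` are hypothetical, and the parameter defaults are the values from Example 1.1.

```python
def dvdt(v, m=68.1, cd=0.25, g=9.81):
    """Right-hand side of Eq. (1.8): the slope dv/dt at velocity v."""
    return g - cd / m * v ** 2

def euler_velocity(dt, t_end, v0=0.0):
    """Apply Eq. (1.12) repeatedly from t = 0 to t_end; returns a list of (t, v) pairs."""
    n_steps = round(t_end / dt)
    t, v = 0.0, v0
    history = [(t, v)]
    for _ in range(n_steps):
        v = v + dvdt(v) * dt   # new value = old value + slope * step size
        t = t + dt
        history.append((t, v))
    return history

for t, v in euler_velocity(dt=2.0, t_end=12.0):
    print(f"{t:4.0f}  {v:8.4f}")
```

With a 2-s step the loop should reproduce the values tabulated in Example 1.2. Calling `euler_velocity(dt=1.0, t_end=12.0)` gives the 1-s result mentioned above.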




As in Example 1.2, a computational price must be paid for a more accurate numeri- 
cal result. Each halving of the step size to attain more accuracy leads to a doubling of the 
number of computations. Thus, we see that there is a trade-off between accuracy and com- 
putational effort. Such trade-offs figure prominently in numerical methods and constitute 
an important theme of this book. 
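
The trade-off can be made concrete by repeating the Euler computation with successively halved step sizes and recording both the work and the error at t = 12 s. This is an illustrative Python sketch (not MATLAB code from this book); the helper names and the particular step sizes are assumptions.

```python
import math

G, M, CD = 9.81, 68.1, 0.25  # parameter values from Example 1.1

def v_exact(t):
    """Analytical velocity, Eq. (1.9)."""
    return math.sqrt(G * M / CD) * math.tanh(math.sqrt(G * CD / M) * t)

def euler_at(t_end, dt):
    """Euler's method, Eq. (1.12), from rest; returns (velocity at t_end, number of steps)."""
    n_steps = round(t_end / dt)
    v = 0.0
    for _ in range(n_steps):
        v += (G - CD / M * v ** 2) * dt
    return v, n_steps

def step_size_study(t_end=12.0, step_sizes=(2.0, 1.0, 0.5, 0.25)):
    """Return (step size, number of steps, absolute error at t_end) for each step size."""
    rows = []
    for dt in step_sizes:
        v, n_steps = euler_at(t_end, dt)
        rows.append((dt, n_steps, abs(v - v_exact(t_end))))
    return rows

for dt, n_steps, err in step_size_study():
    print(f"dt = {dt:4.2f}  steps = {n_steps:3d}  error at t = 12 s: {err:.4f} m/s")
```

Each halving of the step size doubles the number of steps, while the error column should shrink.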


1.2 CONSERVATION LAWS IN ENGINEERING AND SCIENCE


Aside from Newton’s second law, there are other major organizing principles in science 
and engineering. Among the most important of these are the conservation laws. Although 
they form the basis for a variety of complicated and powerful mathematical models, the 
great conservation laws of science and engineering are conceptually easy to understand. 
They all boil down to 


Change = increases — decreases (1.14) 


This is precisely the format that we employed when using Newton’s law to develop a force 
balance for the bungee jumper [Eq. (1.8)]. 

Although simple, Eq. (1.14) embodies one of the most fundamental ways in which 
conservation laws are used in engineering and science—that is, to predict changes 
with respect to time. We will give it a special name—the time-variable (or transient) 
computation. 

Aside from predicting changes, another way in which conservation laws are applied is 
for cases where change is nonexistent. If change is zero, Eq. (1.14) becomes 


Change = 0 = increases — decreases 
or 
Increases = decreases (1.15) 


Thus, if no change occurs, the increases and decreases must be in balance. This case, which 
is also given a special name—the steady-state calculation—has many applications in engi- 
neering and science. For example, for steady-state incompressible fluid flow in pipes, the 
flow into a junction must be balanced by flow going out, as in 


Flow in = flow out 
For the junction in Fig. 1.5, the balance can be used to compute that the flow out of the
fourth pipe must be 60.
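
The steady-state balance of Eq. (1.15) amounts to a single subtraction, sketched here in Python (the book itself uses MATLAB). The function name `unknown_outflow` is hypothetical, and the flow values are those shown for the junction in Fig. 1.5.

```python
def unknown_outflow(inflows, known_outflows):
    """Steady-state balance: total flow in must equal total flow out, Eq. (1.15)."""
    return sum(inflows) - sum(known_outflows)

# Two pipes flow in (100 and 80) and one known pipe flows out (120).
print(unknown_outflow([100, 80], [120]))
```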

For the bungee jumper, the steady-state condition would correspond to the case where 
the net force was zero or [Eq. (1.8) with dv/dt = 0] 

$$mg = c_d v^2 \tag{1.16}$$

Thus, at steady state, the downward and upward forces are in balance and Eq. (1.16) can
be solved for the terminal velocity


$$v = \sqrt{\frac{gm}{c_d}}$$
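
As a numerical check on this steady-state result, the sketch below evaluates the terminal velocity for the jumper of Example 1.1 and confirms that the net acceleration from Eq. (1.8) vanishes there. It is written in Python rather than MATLAB, and the function names are assumptions made for this illustration.

```python
import math

def terminal_velocity(m, cd, g=9.81):
    """Steady-state solution of Eq. (1.16), m*g = cd*v**2, for v."""
    return math.sqrt(g * m / cd)

def net_acceleration(v, m, cd, g=9.81):
    """Right-hand side of Eq. (1.8); zero when gravity and drag are in balance."""
    return g - cd / m * v ** 2

v_t = terminal_velocity(m=68.1, cd=0.25)
print(f"terminal velocity = {v_t:.4f} m/s")
print(f"dv/dt at that velocity = {net_acceleration(v_t, m=68.1, cd=0.25):.2e} m/s^2")
```

The acceleration printed on the second line should be zero to within roundoff.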


Although Eqs. (1.14) and (1.15) might appear trivially simple, they embody the two funda- 
mental ways that conservation laws are employed in engineering and science. As such, they 
will form an important part of our efforts in subsequent chapters to illustrate the connection 
between numerical methods and engineering and science.


FIGURE 1.5
A flow balance for steady incompressible fluid flow at the junction of pipes.
[Pipe 1: flow in = 100; Pipe 2: flow in = 80; Pipe 3: flow out = 120; Pipe 4: flow out = ?]


Table 1.1 summarizes some models and associated conservation laws that figure 
prominently in engineering. Many chemical engineering problems involve mass balances 
for reactors. The mass balance is derived from the conservation of mass. It specifies that 
the change of mass of a chemical in the reactor depends on the amount of mass flowing in 
minus the mass flowing out. 

Civil and mechanical engineers often focus on models developed from the conserva- 
tion of momentum. For civil engineering, force balances are utilized to analyze structures 
such as the simple truss in Table 1.1. The same principles are employed for the mechanical 
engineering case studies to analyze the transient up-and-down motion or vibrations of an 
automobile. 

Finally, electrical engineering studies employ both current and energy balances to model 
electric circuits. The current balance, which results from the conservation of charge, is simi- 
lar in spirit to the flow balance depicted in Fig. 1.5. Just as flow must balance at the junction 
of pipes, electric current must balance at the junction of electric wires. The energy balance 
specifies that the changes of voltage around any loop of the circuit must add up to zero. 

It should be noted that there are many other branches of engineering beyond chemical, 
civil, electrical, and mechanical. Many of these are related to the Big Four. For example, chem- 
ical engineering skills are used extensively in areas such as environmental, petroleum, and bio- 
medical engineering. Similarly, aerospace engineering has much in common with mechanical 
engineering. I will endeavor to include examples from these areas in the coming pages. 


1.3 NUMERICAL METHODS COVERED IN THIS BOOK


Euler’s method was chosen for this introductory chapter because it is typical of many 
other classes of numerical methods. In essence, most consist of recasting mathematical 
operations into the simple kind of algebraic and logical operations compatible with digital 
computers. Figure 1.6 summarizes the major areas covered in this text. 

TABLE 1.1 Devices and types of balances that are commonly used in the four major areas of engineering. For
each case, the conservation law on which the balance is based is specified.

Field                    Device     Organizing Principle        Mathematical Expression
Chemical engineering     Reactors   Conservation of mass        Mass balance. Over a unit of time period:
                                                                $\Delta$mass = inputs − outputs
Civil engineering        Structure  Conservation of momentum    Force balance. At each node:
                                                                $\sum$ horizontal forces ($F_H$) = 0
                                                                $\sum$ vertical forces ($F_V$) = 0
Mechanical engineering   Machine    Conservation of momentum    Force balance:
                                                                $m \frac{d^2x}{dt^2}$ = downward force − upward force
Electrical engineering   Circuit    Conservation of charge      Current balance. For each node:
                                                                $\sum$ current ($i$) = 0
                                    Conservation of energy      Voltage balance. Around each loop:
                                                                $\sum$ emf’s − $\sum$ voltage drops for resistors = 0
                                                                $\sum \xi - \sum iR = 0$

FIGURE 1.6
Summary of the numerical methods covered in this book.
(a) Part 2: Roots and optimization. Roots: solve for $x$ so that $f(x) = 0$. Optimization: solve for $x$ so that $f'(x) = 0$.
(b) Part 3: Linear algebraic equations. Given the a’s and the b’s, solve for the x’s.
(c) Part 4: Curve fitting. Regression and interpolation.
(d) Part 5: Integration and differentiation. Integration: find the area under the curve. Differentiation: find the slope of the curve.
(e) Part 6: Differential equations. Given $dy/dt \approx \Delta y/\Delta t = f(t, y)$, solve for $y$ as a function of $t$: $y_{i+1} = y_i + f(t_i, y_i)\Delta t$.

Part Two deals with two related topics: root finding and optimization. As depicted in 
Fig. 1.6a, root location involves searching for the zeros of a function. In contrast, optimiza- 
tion involves determining a value or values of an independent variable that correspond to a 
“best” or optimal value of a function. Thus, as in Fig. 1.6a, optimization involves identify- 
ing maxima and minima. Although somewhat different approaches are used, root location 
and optimization both typically arise in design contexts. 
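
As a minimal sketch of the two tasks in Fig. 1.6a, the following MATLAB commands locate a root and an
optimum of the same function. The quadratic $f(x) = x^2 - 4x + 1$ and the search intervals are assumptions
chosen only for illustration; they do not come from the text. The built-in functions fzero and fminbnd
are used here as a root finder and a bounded one-dimensional minimizer, respectively.

```matlab
% Illustrative function (an assumption for this sketch): f(x) = x^2 - 4x + 1
f  = @(x) x.^2 - 4*x + 1;
df = @(x) 2*x - 4;            % its derivative, written out by hand

xroot = fzero(f, [0 1]);      % root: f changes sign between x = 0 and x = 1
xmin  = fminbnd(f, 0, 4);     % optimum: minimum of f on the interval 0 <= x <= 4
xflat = fzero(df, [0 4]);     % the same optimum, located as a root of f'(x) = 0

fprintf('root = %.4f, minimum at x = %.4f, f''(x) = 0 at x = %.4f\n', ...
        xroot, xmin, xflat);
```

The last two values should agree to within the solvers’ tolerances, which is the connection Fig. 1.6a
draws between optimization and finding where the derivative is zero.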

Part Three is devoted to solving systems of simultaneous linear algebraic equations 
(Fig. 1.6b). Such systems are similar in spirit to roots of equations in the sense that they 
are concerned with values that satisfy equations. However, in contrast to satisfying a single 
equation, a set of values is sought that simultaneously satisfies a set of linear algebraic 
equations. Such equations arise in a variety of problem contexts and in all disciplines 
of engineering and science. In particular, they originate in the mathematical modeling of 
large systems of interconnected elements such as structures, electric circuits, and fluid 
networks. However, they are also encountered in other areas of numerical methods such as 
curve fitting and differential equations. 
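
To make the idea of a set of values that satisfies every equation at once concrete, here is a short
sketch in MATLAB. The 2-by-2 system is hypothetical and was chosen only for illustration; the backslash
operator is MATLAB’s standard way to solve such a system.

```matlab
% Hypothetical system (values chosen only for illustration):
%   2*x1 +   x2 = 3
%     x1 + 3*x2 = 5
A = [2 1; 1 3];     % the a's (coefficients)
b = [3; 5];         % the b's (right-hand sides)

x = A\b;            % solve A*x = b with the backslash operator
r = A*x - b;        % residual: close to zero when x satisfies every equation

disp(x)
```

A residual near zero confirms that the computed x’s satisfy both equations simultaneously.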

As an engineer or scientist, you will often have occasion to fit curves to data points. 
The techniques developed for this purpose can be divided into two general categories: re- 
gression and interpolation. As described in Part Four (Fig. 1.6c), regression is employed 
where there is a significant degree of error associated with the data. Experimental results 
are often of this kind. For these situations, the strategy is to derive a single curve that 
represents the general trend of the data without necessarily matching any individual 
points. 

In contrast, interpolation is used where the objective is to determine intermediate val- 
ues between relatively error-free data points. Such is usually the case for tabulated informa- 
tion. The strategy in such cases is to fit a curve directly through the data points and use the 
curve to predict the intermediate values. 
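
The contrast between the two strategies can be sketched in a few MATLAB commands. The data below are
made up for illustration and are not from the text; polyfit and polyval fit and evaluate a straight
trend line, whereas interp1 passes directly through the points.

```matlab
% Made-up data for illustration only (not from the text)
x = [1 2 3 4 5];
y = [2.1 3.9 6.2 7.8 10.1];

% Regression: one straight line capturing the trend; it need not hit any point
p    = polyfit(x, y, 1);          % p(1) = slope, p(2) = intercept
yreg = polyval(p, 2.5);           % trend-line estimate at x = 2.5

% Interpolation: pass directly through the points and read between them
yint = interp1(x, y, 2.5);        % piecewise-linear by default

fprintf('regression: %.3f, interpolation: %.3f\n', yreg, yint);
```

The two estimates at x = 2.5 differ slightly because the regression line follows the overall trend,
while the interpolant honors the neighboring data points exactly.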

As depicted in Fig. 1.6d, Part Five is devoted to integration and differentiation. A 
physical interpretation of numerical integration is the determination of the area under a 
curve. Integration has many applications in engineering and science, ranging from the 
determination of the centroids of oddly shaped objects to the calculation of total quantities 
based on sets of discrete measurements. In addition, numerical integration formulas play 
an important role in the solution of differential equations. Part Five also covers methods 
for numerical differentiation. As you know from your study of calculus, this involves the 
determination of a function’s slope or its rate of change. 
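
Both operations can be tried on discrete data with a few MATLAB commands. In this sketch the sampled
function $y = x^2$ and the number of points are illustrative assumptions; trapz applies the trapezoidal
rule and diff forms differences between adjacent values.

```matlab
% Sample y = x^2 at discrete points (an illustrative choice, not from the text)
x = linspace(0, 1, 101);
y = x.^2;

area  = trapz(x, y);                 % trapezoidal estimate of the area under the curve
slope = diff(y)./diff(x);            % finite-difference slope on each interval
xmid  = (x(1:end-1) + x(2:end))/2;   % midpoints where those slopes apply

fprintf('area = %.5f (exact value is 1/3)\n', area);
fprintf('largest slope error = %.2e\n', max(abs(slope - 2*xmid)));
```

For this particular function the difference quotient equals the true slope $2x$ at each interval
midpoint, so the slope error reflects only round-off; for other functions it would also include a
truncation error.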

Finally, Part Six focuses on the solution of ordinary differential equations (Fig. 1.6e). 
Such equations are of great significance in all areas of engineering and science. This is 
because many physical laws are couched in terms of the rate of change of a quantity rather 
than the magnitude of the quantity itself. Examples range from population-forecasting 
models (rate of change of population) to the acceleration of a falling body (rate of change 
of velocity). Two types of problems are addressed: initial-value and boundary-value 
problems. 
1.4 CASE STUDY IT’S A REAL DRAG 


Background. In our model of the free-falling bungee jumper, we assumed that drag 
depends on the square of velocity (Eq. 1.7). A more detailed representation, which was 
originally formulated by Lord Rayleigh, can be written as 


$$F_d = -\frac{1}{2} \rho v^2 A C_d \vec{v} \tag{1.17}$$


where $F_d$ = the drag force (N), $\rho$ = fluid density (kg/m$^3$), $A$ = the frontal area of the object 
on a plane perpendicular to the direction of motion (m$^2$), $C_d$ = a dimensionless drag coefficient, 
and $\vec{v}$ = a unit vector indicating the direction of velocity. 

This relationship, which assumes turbulent conditions (i.e., a high Reynolds number), 
allows us to express the lumped drag coefficient from Eq. (1.7) in more fundamental terms 
as 


$$c_d = \frac{1}{2} \rho A C_d \tag{1.18}$$


Thus, the lumped drag coefficient depends on the object’s area, the fluid’s density, and a 
dimensionless drag coefficient. The latter accounts for all the other factors that contribute 
to air resistance such as the object’s “roughness.” For example, a jumper wearing a baggy 
outfit will have a higher $C_d$ than one wearing a sleek jumpsuit. 

Note that for cases where velocity is very low, the flow regime around the object will 
be laminar and the relationship between the drag force and velocity becomes linear. This is 
referred to as Stokes drag. 

In developing our bungee jumper model, we assumed that the downward direction was 
positive. Thus, Eq. (1.7) is an accurate representation of Eq. (1.17), because $\vec{v} = +1$ and the 
drag force is negative. Hence, drag reduces velocity. 

But what happens if the jumper has an upward (i.e., negative) velocity? In this case, 
$\vec{v} = -1$ and Eq. (1.17) yields a positive drag force. Again, this is physically correct as the 
positive drag force acts downward against the upward negative velocity. 

Unfortunately, for this case, Eq. (1.7) yields a negative drag force because it does not 
include the unit directional vector. In other words, by squaring the velocity, its sign and 
hence its direction is lost. Consequently, the model yields the physically unrealistic result 
that air resistance acts to accelerate an upward velocity! 

In this case study, we will modify our model so that it works properly for both downward 
and upward velocities. We will test the modified model for the same case as Example 1.2, but 
with an initial value of $v(0) = -40$ m/s. In addition, we will illustrate how we can extend 
the numerical analysis to determine the jumper’s position. 


Solution. The following simple modification allows the sign to be incorporated into the 
drag force: 


$$F_d = -\frac{1}{2} \rho v |v| A C_d \tag{1.19}$$


or in terms of the lumped drag: 


$$F_d = -c_d v |v| \tag{1.20}$$


Thus, the differential equation to be solved is 


$$\frac{dv}{dt} = g - \frac{c_d}{m} v |v| \tag{1.21}$$


In order to determine the jumper’s position, we recognize that distance traveled, 
x (m), is related to velocity by 

$$\frac{dx}{dt} = -v \tag{1.22}$$

In contrast to velocity, this formulation assumes that upward distance is positive. In 
the same fashion as Eq. (1.12), this equation can be integrated numerically with Euler’s 
method: 


$$x_{i+1} = x_i - v(t_i)\Delta t \tag{1.23}$$


Assuming that the jumper’s initial position is defined as x(0) = 0, and using the parameter 
values from Examples 1.1 and 1.2, the velocity and distance at t = 2 s can be computed as 


$$v(2) = -40 + \left[9.81 - \frac{0.25}{68.1}(-40)(40)\right] 2 = -8.6326 \text{ m/s}$$


$$x(2) = 0 - (-40)2 = 80 \text{ m}$$


Note that if we had used the incorrect drag formulation, the results would be −32.1274 m/s 
and 80 m. 
The computation can be repeated for the next interval (t = 2 to 4 s): 


$$v(4) = -8.6326 + \left[9.81 - \frac{0.25}{68.1}(-8.6326)(8.6326)\right] 2 = 11.5346 \text{ m/s}$$


$$x(4) = 80 - (-8.6326)2 = 97.2651 \text{ m}$$


The incorrect drag formulation gives −20.0858 m/s and 144.2549 m. 

The calculation is continued and the results shown in Fig. 1.7 along with those obtained 
with the incorrect drag model. Notice that the correct formulation decelerates more rapidly 
because drag always diminishes the velocity. 
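
The repeated calculation can be sketched as a short MATLAB loop. This is only an illustration under
stated assumptions: the mass and lumped drag coefficient (68.1 kg and 0.25 kg/m) are taken to be the
parameter values of Examples 1.1 and 1.2, the end time of 12 s is arbitrary, and the variable names are
hypothetical. The loop applies Euler’s method to Eq. (1.21) and Eq. (1.23), and carries the original
squared-velocity drag of Eq. (1.7) alongside for comparison.

```matlab
% Parameter values assumed to match Examples 1.1 and 1.2
g   = 9.81;      % gravitational acceleration (m/s^2)
m   = 68.1;      % jumper's mass (kg)
c_d = 0.25;      % lumped drag coefficient (kg/m)
dt  = 2;         % step size (s)
t   = 0:dt:12;   % time grid (s); the end time is an arbitrary choice
n   = numel(t);

v  = zeros(1, n);  x  = zeros(1, n);   % sign-corrected drag, Eq. (1.21)
vb = zeros(1, n);  xb = zeros(1, n);   % squared-velocity drag, Eq. (1.7)
v(1)  = -40;                           % upward initial velocity (m/s)
vb(1) = -40;

for i = 1:n-1
    v(i+1)  = v(i)  + (g - c_d/m*v(i)*abs(v(i)))*dt;   % Euler step with v|v|
    x(i+1)  = x(i)  - v(i)*dt;                         % Eq. (1.23)
    vb(i+1) = vb(i) + (g - c_d/m*vb(i)^2)*dt;          % Euler step with v^2
    xb(i+1) = xb(i) - vb(i)*dt;
end

disp([t' v' x' vb' xb'])   % columns: t, v, x (corrected), v, x (uncorrected)
```

The rows for t = 2 and 4 s should match the hand calculations above, and the remaining rows show the two
velocity solutions approaching each other once both are directed downward.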

With time, both velocity solutions converge on the same terminal velocity because 
eventually both are directed downward in which case, Eq. (1.7) is correct. However, the 
impact on the height prediction is quite dramatic with the incorrect drag case resulting in 
a much higher trajectory. 

This case study demonstrates how important it is to have the correct physical model. In 
some cases, the solution will yield results that are clearly unrealistic. The current example 
is more insidious as there is no visual evidence that the incorrect solution is wrong. That is, 
the incorrect solution “looks” reasonable. 
FIGURE 1.7 
Plots of (a) velocity (m/s) and (b) height (m) for the free-falling bungee jumper with an upward 
(negative) initial velocity generated with Euler’s method. Results for both the correct (Eq. 1.20) 
and incorrect (Eq. 1.7) drag formulations are displayed. 
PROBLEMS 


1.1 Use calculus to verify that Eq. (1.9) is a solution of 
Eq. (1.8) for the initial condition v(0) = 0. 

1.2 Use calculus to solve Eq. (1.21) for the case where the initial 
velocity is (a) positive and (b) negative. (c) Based on your 
results for (a) and (b), perform the same computation as in 
Example 1.1 but with an initial velocity of −40 m/s. Compute 
values of the velocity from t = 0 to 12 s at intervals of 2 s. Note 
that for this case, the zero velocity occurs at t = 3.470239 s. 
1.3 The following information is available for a bank account: 


Date     Deposits     Withdrawals     Balance 
5/1                                   1512.33 
         220.13       327.26 
6/1 
         216.80       378.61 
7/1 
         450.25       106.80 
8/1 
         127.31       350.61 
9/1 


Note that the money earns interest which is computed as 


$$\text{Interest} = iB_i$$


where $i$ = the interest rate expressed as a fraction per month, 
and $B_i$ = the initial balance at the beginning of the month. 

(a) Use the conservation of cash to compute the balance on 
6/1, 7/1, 8/1, and 9/1 if the interest rate is 1% per month 
($i$ = 0.01/month). Show each step in the computation. 

(b) Write a differential equation for the cash balance in the 
form 


$$\frac{dB}{dt} = f[D(t), W(t), i]$$


where $t$ = time (months), $D(t)$ = deposits as a function 
of time ($/month), $W(t)$ = withdrawals as a function of 
time ($/month). For this case, assume that interest is 
compounded continuously; that is, interest = $iB$. 

(c) Use Euler’s method with a time step of 0.5 month to 
simulate the balance. Assume that the deposits and with- 
drawals are applied uniformly over the month. 

(d) Develop a plot of balance versus time for (a) and (c). 

1.4 Repeat Example 1.2. Compute the velocity to t = 12 s, 

with a step size of (a) 1 and (b) 0.5 s. Can you make any 

statement regarding the errors of the calculation based on 
the results? 


1.5 Rather than the nonlinear relationship of Eq. (1.7), you 
might choose to model the upward force on the bungee 
jumper as a linear relationship: 


$$F_U = -c'v$$


where c’ = a first-order drag coefficient (kg/s). 
(a) Using calculus, obtain the closed-form solution for the 
case where the jumper is initially at rest (v = 0 at t = 0). 
(b) Repeat the numerical calculation in Example 1.2 with 
the same initial condition and parameter values. Use a 
value of 11.5 kg/s for c’. 
1.6 For the free-falling bungee jumper with linear drag 
(Prob. 1.5), assume a first jumper is 70 kg and has a drag co- 
efficient of 12 kg/s. If a second jumper has a drag coefficient 
of 15 kg/s and a mass of 80 kg, how long will it take her to 
reach the same velocity jumper 1 reached in 9 s? 
1.7 For the second-order drag model (Eq. 1.8), compute the 
velocity of a free-falling parachutist using Euler’s method 
for the case where $m$ = 80 kg and $c_d$ = 0.25 kg/m. Perform 
the calculation from t = 0 to 20 s with a step size of 1 s. Use 
an initial condition that the parachutist has an upward velocity 
of 20 m/s at t = 0. At t = 10 s, assume that the chute is 
instantaneously deployed so that the drag coefficient jumps 
to 1.5 kg/m. 
1.8 The amount of a uniformly distributed radioactive con- 
taminant contained in a closed reactor is measured by its 
concentration c (becquerel/liter or Bq/L). The contaminant 
decreases at a decay rate proportional to its concentration; 
that is 


$$\text{Decay rate} = -kc$$


where $k$ is a constant with units of day$^{-1}$. Therefore, according 
to Eq. (1.14), a mass balance for the reactor can be 
written as 


$$\frac{dc}{dt} = -kc$$

(change in mass) = (decrease by decay) 


(a) Use Euler’s method to solve this equation from t = 0 to 
1 d with $k$ = 0.175 d$^{-1}$. Employ a step size of $\Delta t$ = 0.1 d. 
The concentration at t = 0 is 100 Bq/L. 

(b) Plot the solution on a semilog graph (i.e., ln $c$ versus $t$) 
and determine the slope. Interpret your results. 

1.9 A storage tank (Fig. P1.9) contains a liquid at depth y 

where y = 0 when the tank is half full. Liquid is withdrawn 

at a constant flow rate Q to meet demands. The contents are 
resupplied at a sinusoidal rate $3Q \sin^2(t)$. Equation (1.14) 
can be written for this system as 


$$\frac{d(Ay)}{dt} = 3Q \sin^2(t) - Q$$

(change in volume) = (inflow) − (outflow) 


or, since the surface area $A$ is constant 


$$\frac{dy}{dt} = 3\frac{Q}{A} \sin^2(t) - \frac{Q}{A}$$


Use Euler’s method to solve for the depth $y$ from t = 0 to 
10 d with a step size of 0.5 d. The parameter values are $A$ = 
1250 m$^2$ and $Q$ = 450 m$^3$/d. Assume that the initial condition 
is $y$ = 0. 

FIGURE P1.9 

1.10 For the same storage tank described in Prob. 1.9, sup- 
pose that the outflow is not constant but rather depends on 
the depth. For this case, the differential equation for depth 
can be written as 


$$\frac{dy}{dt} = 3\frac{Q}{A} \sin^2(t) - \frac{\alpha(1 + y)^{1.5}}{A}$$


Use Euler’s method to solve for the depth $y$ from t = 0 to 
10 d with a step size of 0.5 d. The parameter values are $A$ = 
1250 m$^2$, $Q$ = 450 m$^3$/d, and $\alpha$ = 150. Assume that the initial 
condition is $y$ = 0. 

1.11 Apply the conservation of volume (see Prob. 1.9) to sim- 
ulate the level of liquid in a conical storage tank (Fig. P1.11). 


FIGURE P1.11 


The liquid flows in at a sinusoidal rate of $Q_{in} = 3 \sin^2(t)$ and 
flows out according to 


$$Q_{out} = 3(y - y_{out})^{1.5} \qquad y > y_{out}$$
$$Q_{out} = 0 \qquad y \le y_{out}$$


where flow has units of m$^3$/d and $y$ = the elevation of the 
water surface above the bottom of the tank (m). Use Euler’s 
method to solve for the depth $y$ from t = 0 to 10 d with a step 
size of 0.5 d. The parameter values are $r_{top}$ = 2.5 m, $y_{top}$ = 4 m, 
and $y_{out}$ = 1 m. Assume that the level is initially below the 
outlet pipe with $y(0)$ = 0.8 m. 

1.12 A group of 35 students attend a class in an insu- 
lated room which measures 11 by 8 by 3 m. Each student 
takes up about 0.075 m$^3$ and gives out about 80 W of heat 
(1 W = 1 J/s). Calculate the air temperature rise during 
the first 20 minutes of the class if the room is completely 
sealed and insulated. Assume the heat capacity $C_v$ for air is 
0.718 kJ/(kg K). Assume air is an ideal gas at 20 °C and 
101.325 kPa. Note that the heat absorbed by the air $Q$ is 
related to the mass of the air $m$, the heat capacity, and the 
change in temperature by the following relationship: 


$$Q = m \int_{T_1}^{T_2} C_v \, dT = m C_v (T_2 - T_1)$$

The mass of air can be obtained from the ideal gas law: 


$$PV = \frac{m}{\text{Mwt}} RT$$

where $P$ is the gas pressure, $V$ is the volume of the gas, Mwt 
is the molecular weight of the gas (for air, 28.97 kg/kmol), 
and $R$ is the ideal gas constant [8.314 kPa m$^3$/(kmol K)]. 

FIGURE P1.13 


1.13 Figure P1.13 depicts the various ways in which an 
average man gains and loses water in one day. One liter 
is ingested as food, and the body metabolically produces 
0.3 liters. In breathing air, the exchange is 0.05 liters while 
inhaling, and 0.4 liters while exhaling over a one-day period. 
The body will also lose 0.3, 1.4, 0.2, and 0.35 liters through 
sweat, urine, feces, and through the skin, respectively. To 
maintain steady state, how much water must be drunk per day? 
1.14 In our example of the free-falling bungee jumper, we 
assumed that the acceleration due to gravity was a constant 
value of 9.81 m/s$^2$. Although this is a decent approximation 
when we are examining falling objects near the surface 
of the earth, the gravitational force decreases as we move 
above sea level. A more general representation based on 
Newton’s inverse square law of gravitational attraction can 
be written as 

$$g(x) = g(0) \frac{R^2}{(R + x)^2}$$

where $g(x)$ = gravitational acceleration at altitude $x$ (in m) 
measured upward from the earth’s surface (m/s$^2$), $g(0)$ = 
gravitational acceleration at the earth’s surface ($\cong$ 9.81 m/s$^2$), 
and $R$ = the earth’s radius ($6.37 \times 10^6$ m). 

(a) In a fashion similar to the derivation of Eq. (1.8), use a 
force balance to derive a differential equation for velocity 
as a function of time that utilizes this more complete 
representation of gravitation. However, for this derivation, 
assume that upward velocity is positive. 

(b) For the case where drag is negligible, use the chain rule 
to express the differential equation as a function of altitude 
rather than time. Recall that the chain rule is 


$$\frac{dv}{dt} = \frac{dv}{dx} \frac{dx}{dt}$$


(c) Use calculus to obtain the closed form solution where 
$v = v_0$ at $x = 0$. 

(d) Use Euler’s method to obtain a numerical solution from 
x = 0 to 100,000 m using a step of 10,000 m where the 
initial velocity is 1500 m/s upward. Compare your result 
with the analytical solution. 

1.15 Suppose that a spherical droplet of liquid evaporates at 

a rate that is proportional to its surface area. 


$$\frac{dV}{dt} = -kA$$

where $V$ = volume (mm$^3$), $t$ = time (min), $k$ = the evaporation 
rate (mm/min), and $A$ = surface area (mm$^2$). Use 
Euler’s method to compute the volume of the droplet from 
t = 0 to 10 min using a step size of 0.25 min. Assume that 
k = 0.08 mm/min and that the droplet initially has a radius of 
2.5 mm. Assess the validity of your results by determining 
the radius of your final computed volume and verifying that 
it is consistent with the evaporation rate. 
1.16 A fluid is pumped into the network shown in Fig. P1.16. 
If $Q_2$ = 0.7, $Q_3$ = 0.5, $Q_7$ = 0.1, and $Q_8$ = 0.3 m$^3$/s, determine 
the other flows. 
1.17 Newton’s law of cooling says that the temperature of 
a body changes at a rate proportional to the difference between 
its temperature and that of the surrounding medium 
(the ambient temperature), 

$$\frac{dT}{dt} = -k(T - T_a)$$

where $T$ = the temperature of the body (°C), $t$ = time (min), 
$k$ = the proportionality constant (per minute), and $T_a$ = the 
ambient temperature (°C). Suppose that a cup of coffee 
originally has a temperature of 70 °C. Use Euler’s method to 
compute the temperature from t = 0 to 20 min using a step 
size of 2 min if $T_a$ = 20 °C and $k$ = 0.019/min. 


FIGURE P1.16 
1.18 You are working as a crime scene investigator and 
must predict the temperature of a homicide victim over a 
5-hr period. You know that the room where the victim was 
found was at 10°C when the body was discovered. 

(a) Use Newton’s law of cooling (Prob. 1.17) and Euler’s 
method to compute the victim’s body temperature for the 
5-hr period using values of $k$ = 0.12/hr and $\Delta t$ = 0.5 hr. 
Assume that the victim’s body temperature at the time 
of death was 37°C, and that the room temperature was 
at a constant value of 10 °C over the 5-hr period. 

(b) Further investigation reveals that the room temperature 
had actually dropped linearly from 20 to 10 °C over the 
5-hr period. Repeat the same calculation as in (a) but 
incorporate this new information. 

(c) Compare the results from (a) and (b) by plotting them 
on the same graph. 

1.19 The velocity is equal to the rate of change of distance, 
$x$ (m): 


$$\frac{dx}{dt} = v(t) \tag{P1.19}$$

Use Euler’s method to numerically integrate Eqs. (P1.19) 
and (1.8) in order to determine both the velocity and distance 
fallen as a function of time for the first 10 seconds of freefall 
using the same parameters and conditions as in Example 1.2. 
Develop a plot of your results. 

1.20 In addition to the downward force of gravity (weight) 

and drag, an object falling through a fluid is also subject 

to a buoyancy force which is proportional to the displaced 
volume (Archimedes’ principle). For example, for a sphere 
with diameter $d$ (m), the sphere’s volume is $V = \pi d^3/6$, and 
its projected area is $A = \pi d^2/4$. The buoyancy force can then 
be computed as $F_b = -\rho V g$. We neglected buoyancy in our 
derivation of Eq. (1.8) because it is relatively small for an 
object like a bungee jumper moving through air. However, 
for a more dense fluid like water, it becomes more prominent. 

(a) Derive a differential equation in the same fashion as 
Eq. (1.8), but include the buoyancy force and represent 
the drag force as described in Sec. 1.4. 

(b) Rewrite the differential equation from (a) for the special 
case of a sphere. 

(c) Use the equation developed in (b) to compute the terminal 
velocity (i.e., for the steady-state case). Use the following 
parameter values for a sphere falling through water: 
sphere diameter = 1 cm, sphere density = 2700 kg/m$^3$, 
water density = 1000 kg/m$^3$, and $C_d$ = 0.47. 

(d) Use Euler’s method with a step size of $\Delta t$ = 0.03125 s 
to numerically solve for the velocity from t = 0 to 0.25 s 
with an initial velocity of zero. 


1.21 As noted in Sec. 1.4, a fundamental representation of 
the drag force, which assumes turbulent conditions (i.e., a 
high Reynolds number), can be formulated as 


$$F_d = -\frac{1}{2} \rho A C_d v |v|$$


where $F_d$ = the drag force (N), $\rho$ = fluid density (kg/m$^3$), $A$ = 
the frontal area of the object on a plane perpendicular to the 
direction of motion (m$^2$), $v$ = velocity (m/s), and $C_d$ = a dimensionless 
drag coefficient. 

(a) Write the pair of differential equations for velocity and 
position (see Prob. 1.19) to describe the vertical motion of 
a sphere with diameter, $d$ (m), and a density of $\rho_s$ (kg/m$^3$). 
The differential equation for velocity should be written as 
a function of the sphere’s diameter. 

(b) Use Euler’s method with a step size of $\Delta t$ = 2 s to compute 
the position and velocity of a sphere over the first 
14 seconds. Employ the following parameters in your 
calculation: $d$ = 120 cm, $\rho$ = 1.3 kg/m$^3$, $\rho_s$ = 2700 kg/m$^3$, 
and $C_d$ = 0.47. Assume that the sphere has the initial 
conditions: $x(0)$ = 100 m and $v(0)$ = −40 m/s. 

(c) Develop a plot of your results (i.e., y and v versus t) and 
use it to graphically estimate when the sphere would hit 
the ground. 

(d) Compute the value for the bulk second-order drag coefficient, 
$c_d'$ (kg/m). Note that the bulk second-order drag 
coefficient is the term in the final differential equation 
for velocity that multiplies the term $v|v|$. 

1.22 As depicted in Fig. P1.22, a spherical particle settling 
through a quiescent fluid is subject to three forces: the 
downward force of gravity ($F_G$), and the upward forces of 
buoyancy ($F_B$) and drag ($F_D$). Both the gravity and buoyancy 
forces can be computed with Newton’s second law with the 
latter equal to the weight of the displaced fluid. For laminar 
flow, the drag force can be computed with Stokes’ law, 


$$F_D = 3\pi\mu d v$$


where $\mu$ = the dynamic viscosity of the fluid (N s/m$^2$), $d$ = 
the particle diameter (m), and $v$ = the particle’s settling 
velocity (m/s). The mass of the particle can be expressed as 
the product of the particle’s volume and density, $\rho_s$ (kg/m$^3$), 
and the mass of the displaced fluid can be computed as the 
product of the particle’s volume and the fluid’s density, 
$\rho$ (kg/m$^3$). The volume of a sphere is $\pi d^3/6$. In addition, 
laminar flow corresponds to the case where the dimensionless 
Reynolds number, Re, is less than 1, where Re = $\rho d v/\mu$. 

FIGURE P1.22 

(a) Use a force balance for the particle to develop the differential 
equation for $dv/dt$ as a function of $d$, $\rho$, $\rho_s$, and $\mu$. 

(b) At steady-state, use this equation to solve for the particle’s 
terminal velocity. 

(c) Employ the result of (b) to compute the particle’s terminal 
velocity in m/s for a spherical silt particle settling 
in water: $d$ = 10 $\mu$m, $\rho$ = 1 g/cm$^3$, $\rho_s$ = 2.65 g/cm$^3$, and 
$\mu$ = 0.014 g/(cm·s). 

(d) Check whether flow is laminar. 

(e) Use Euler’s method to compute the velocity from t = 0 to 
$2^{-15}$ s with $\Delta t = 2^{-18}$ s given the initial condition: $v(0)$ = 0. 

1.23 As depicted in Fig. P1.23, the downward deflection, 
$y$ (m), of a cantilever beam with a uniform load, $w$ = 10,000 
kg/m, can be computed as 


$$y = \frac{w}{24EI}\left(x^4 - 4Lx^3 + 6L^2x^2\right)$$

where $x$ = distance (m), $E$ = the modulus of elasticity = 
$2 \times 10^{11}$ Pa, $I$ = moment of inertia = $3.25 \times 10^{-4}$ m$^4$, and $L$ = 
length = 4 m. This equation can be differentiated to yield the 
slope of the downward deflection as a function of $x$ 


$$\frac{dy}{dx} = \frac{w}{24EI}\left(4x^3 - 12Lx^2 + 12L^2x\right)$$

If $y$ = 0 at $x$ = 0, use this equation with Euler’s method ($\Delta x$ = 
0.125 m) to compute the deflection from $x$ = 0 to $L$. Develop 
a plot of your results along with the analytical solution computed 
with the first equation. 


FIGURE P1.23 
A cantilever beam. 

1.24 Use Archimedes’ principle to develop a steady-state 
force balance for a spherical ball of ice floating in seawater. 
The force balance should be expressed as a third-order polynomial 
(cubic) in terms of height of the cap above the water 
line ($h$), and the seawater’s density ($\rho_f$), the ball’s density 
($\rho_s$), and radius ($r$). 

1.25 Beyond fluids, Archimedes’ principle has proven 
useful in geology when applied to solids on the earth’s 
crust. Figure P1.25 depicts one such case where a lighter 
conical granite mountain “floats on” a denser basalt layer 
at the earth’s surface. Note that the part of the cone below 
the surface is formally referred to as a frustum. Develop a 
steady-state force balance for this case in terms of the following 
parameters: basalt’s density ($\rho_b$), granite’s density 
($\rho_g$), the cone’s bottom radius ($r$), and the height above ($h_1$) 
and below ($h_2$) the earth’s surface. 


FIGURE P1.24 

FIGURE P1.25 


1.26 As depicted in Fig. P1.26, an RLC circuit consists of 
three elements: a resistor (R), an inductor (L), and a capacitor 
(C). The flow of current across each element induces a 
voltage drop. Kirchhoff’s second voltage law states that the 
algebraic sum of these voltage drops around a closed circuit 
is zero, 

$$iR + L\frac{di}{dt} + \frac{q}{C} = 0$$

where $i$ = current, $R$ = resistance, $L$ = inductance, $t$ = time, 
$q$ = charge, and $C$ = capacitance. In addition, the current is 
related to charge as in 


$$\frac{dq}{dt} = i$$


(a) If the initial values are $i(0)$ = 0 and $q(0)$ = 1 C, use 
Euler’s method to solve this pair of differential equations 
from t = 0 to 0.1 s using a step size of $\Delta t$ = 0.01 s. 
Employ the following parameters for your calculation: 
$R$ = 200 $\Omega$, $L$ = 5 H, and $C$ = $10^{-4}$ F. 

(b) Develop a plot of i and q versus t. 

1.27 Suppose that a parachutist with linear drag (m = 70 kg, 

c = 12.5 kg/s) jumps from an airplane flying at an altitude 

of 200 m with a horizontal velocity of 180 m/s relative to 

the ground. 

(a) Write a system of four differential equations for $x$, $y$, $v_x$ = 
$dx/dt$ and $v_y$ = $dy/dt$. 

(b) If the initial horizontal position is defined as $x$ = 0, use 
Euler’s method with $\Delta t$ = 1 s to compute the jumper’s 
position over the first 10 seconds. 

(c) Develop plots of y versus t and y versus x. Use the plot to 
graphically estimate when and where the jumper would 
hit the ground if the chute failed to open. 


FIGURE P1.26 

1.28 Figure P1.28 shows the forces exerted on a hot air bal- 

loon system. 


FIGURE P1.28 
Forces on a hot air balloon: $F_B$ = buoyancy, $F_G$ = weight 
of gas, $F_P$ = weight of payload (including the balloon 
envelope), and $F_D$ = drag. Note that the direction of the 
drag is downward when the balloon is rising. 
Formulate the drag force as 

$$F_D = \frac{1}{2} \rho_a v^2 A C_d$$

where $\rho_a$ = air density (kg/m$^3$), $v$ = velocity (m/s), $A$ = projected 
frontal area (m$^2$), and $C_d$ = the dimensionless drag coefficient 
($\cong$ 0.47 for a sphere). Note also that the total mass 
of the balloon consists of two components: 


$$m = m_G + m_P$$


where $m_G$ = the mass of the gas inside the expanded balloon 
(kg), and $m_P$ = the mass of the payload (basket, passengers, 
and the unexpanded balloon = 265 kg). Assume that the 
ideal gas law holds ($P = \rho RT$), that the balloon is a perfect 
sphere with a diameter of 17.3 m, and that the heated air 


inside the envelope is at roughly the same pressure as the 
outside air. 
Other necessary parameters are: 


Normal atmospheric pressure, P = 101,300 Pa 

The gas constant for dry air, R = 287 Joules/(kg K) 

The air inside the balloon is heated to an average tempera- 
ture, T = 100 °C 

The normal (ambient) air density, $\rho$ = 1.2 kg/m$^3$. 


(a) Use a force balance to develop the differential equa- 
tion for dv/dt as a function of the model’s fundamental 
parameters. 

(b) At steady-state, calculate the balloon’s terminal velocity. 

(c) Use Euler’s method and Excel to compute the velocity 
from t = 0 to 60 s with $\Delta t$ = 2 s given the previous 
parameters along with the initial condition: v(0) = 0. 
Develop a plot of your results. 
CHAPTER OBJECTIVES 


The primary objective of this chapter is to provide an introduction and overview of 
how MATLAB’s calculator mode is used to implement interactive computations. 
Specific objectives and topics covered are 


• Learning how real and complex numbers are assigned to variables. 
• Learning how vectors and matrices are assigned values using simple assignment, 
  the colon operator, and the linspace and logspace functions. 
• Understanding the priority rules for constructing mathematical expressions. 
• Gaining a general understanding of built-in functions and how you can learn more 
  about them with MATLAB’s Help facilities. 
• Learning how to use vectors to create a simple line plot based on an equation. 


YOU’VE GOT A PROBLEM 


In Chap. 1, we used a force balance to determine the terminal velocity of a free-falling 
object like a bungee jumper: 


$$v_t = \sqrt{\frac{gm}{c_d}}$$


where $v_t$ = terminal velocity (m/s), $g$ = gravitational acceleration (m/s$^2$), $m$ = mass (kg), 
and $c_d$ = a drag coefficient (kg/m). Aside from predicting the terminal velocity, this equation 
can also be rearranged to compute the drag coefficient 


$$c_d = \frac{mg}{v_t^2} \tag{2.1}$$

TABLE 2.1 Data for the mass and associated terminal velocities of a number of jumpers. 

m, kg       83.6   60.2   72.1   91.1   92.9   65.3   80.9 
v_t, m/s    53.4   48.5   50.9   55.7   54     47.7   51.1 


Thus, if we measure the terminal velocity of a number of jumpers of known mass, this 
equation provides a means to estimate the drag coefficient. The data in Table 2.1 were col- 
lected for this purpose. 

In this chapter, we will learn how MATLAB can be used to analyze such data. Beyond 
showing how MATLAB can be employed to compute quantities like drag coefficients, we 
will also illustrate how its graphical capabilities provide additional insight into such analyses. 


2.1 THE MATLAB ENVIRONMENT


MATLAB is a computer program that provides the user with a convenient environment for 
performing many types of calculations. In particular, it provides a very nice tool to imple- 
ment numerical methods. 

The most common way to operate MATLAB is by entering commands one at a time 
in the command window. In this chapter, we use this interactive or calculator mode to in- 
troduce you to common operations such as performing calculations and creating plots. In 
Chap. 3, we show how such commands can be used to create MATLAB programs. 

One further note. This chapter has been written as a hands-on exercise. That is, you 
should read it while sitting in front of your computer. The most efficient way to become 
proficient is to actually implement the commands on MATLAB as you proceed through 
the following material. 

MATLAB uses three primary windows: 


• Command window. Used to enter commands and data.
• Graphics window. Used to display plots and graphs.
• Edit window. Used to create and edit M-files.


In this chapter, we will make use of the command and graphics windows. In Chap. 3 we 
will use the edit window to create M-files. 

After starting MATLAB, the command window will open with the command prompt
being displayed

>>

The calculator mode of MATLAB operates in a sequential fashion as you type in commands
line by line. For each command, you get a result. Thus, you can think of it as operating
like a very fancy calculator. For example, if you type in

>> 55 - 16

MATLAB will display the result¹

ans =
    39

¹ MATLAB skips a line between the label (ans =) and the number (39). Here, we omit such blank lines for
conciseness. You can control whether blank lines are included with the format compact and format loose
commands.


Notice that MATLAB has automatically assigned the answer to a variable, ans. Thus, you 
could now use ans in a subsequent calculation: 


>> ans + 11 
with the result 


ans = 
50 


MATLAB assigns the result to ans whenever you do not explicitly assign the calculation to 
a variable of your own choosing. 


2.2 ASSIGNMENT

Assignment refers to assigning values to variable names. This results in the storage of the 
values in the memory location corresponding to the variable name. 

2.2.1 Scalars 


The assignment of values to scalar variables is similar to other computer languages. 
Try typing 

>> a = 4

Note how the assignment echo prints to confirm what you have done:

a =
     4


Echo printing is a characteristic of MATLAB. It can be suppressed by terminating the com- 
mand line with the semicolon (;) character. Try typing 


>> A=6; 


You can type several commands on the same line by separating them with commas or 
semicolons. If you separate them with commas, they will be displayed, and if you use the 
semicolon, they will not. For example, 


>> a = 4, A = 6; x = 1;

a =
     4


MATLAB treats names in a case-sensitive manner—that is, the variable a is not the 
same as A. To illustrate this, enter 


>> a

and then enter

>> A

See how their values are distinct. They are distinct names.
We can assign complex values to variables, since MATLAB handles complex arithmetic
automatically. The unit imaginary number $\sqrt{-1}$ is preassigned to the variable i.
Consequently, a complex value can be assigned simply as in


>> x = 2+i*4

x =
   2.0000 + 4.0000i


It should be noted that MATLAB allows the symbol j to be used to represent the unit 
imaginary number for input. However, it always uses an i for display. For example, 


>> x = 2+j*4 


x =
   2.0000 + 4.0000i


There are several predefined variables, for example, pi. 
>> pi 


ans = 
3.1416 


Notice how MATLAB displays four decimal places. If you desire additional precision, 
enter the following: 


>> format long 
Now when pi is entered the result is displayed to 15 significant figures: 
>> pi 


ans = 
3.14159265358979 


To return to the four decimal version, type 
>> format short 


The following is a summary of the format commands you will employ routinely in engi- 
neering and scientific calculations. They all have the syntax: format type. 


type        Result                                                                          Example
short       Scaled fixed-point format with 5 digits                                         3.1416
long        Scaled fixed-point format with 15 digits for double and 7 digits for single     3.14159265358979
short e     Floating-point format with 5 digits                                             3.1416e+000
long e      Floating-point format with 15 digits for double and 7 digits for single         3.141592653589793e+000
short g     Best of fixed- or floating-point format with 5 digits                           3.1416
long g      Best of fixed- or floating-point format with 15 digits for double               3.14159265358979
            and 7 digits for single
short eng   Engineering format with at least 5 digits and a power that is a multiple of 3   3.1416e+000
long eng    Engineering format with exactly 16 significant digits and a power               3.14159265358979e+000
            that is a multiple of 3
bank        Fixed dollars and cents                                                         3.14
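
It may help to confirm that the format commands change only how a number is displayed, not the value that is stored. The following is a minimal sketch rather than part of the original session; the variable names x and y are illustrative assumptions.

```matlab
% format changes the display only; the stored double is untouched.
x = pi;            % stored in full double precision

format short
disp(x)            % displayed with 4 decimal places

format long
disp(x)            % displayed with about 15 significant figures

% The stored value never changed, so this comparison is true
% whichever display format is active.
y = (x == pi);     % logical 1 (true)

format short       % return to the default display
```

Because only the display changes, you can switch formats freely in the middle of a calculation without affecting later results.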
2.2.2 Arrays, Vectors, and Matrices


An array is a collection of values that are represented by a single variable name. 
One-dimensional arrays are called vectors and two-dimensional arrays are called 
matrices. The scalars used in Sec. 2.2.1 are actually matrices with one row and one 
column. 

Brackets are used to enter arrays in the command mode. For example, a row vector can 
be assigned as follows: 


>> a = [1 2 3 4 5]

a =
     1     2     3     4     5


Note that this assignment overrides the previous assignment of a = 4. 

In practice, row vectors are rarely used to solve mathematical problems. When we 
speak of vectors, we usually refer to column vectors, which are more commonly used. A 
column vector can be entered in several ways. Try them. 


>> b = [2;4;6;8;10]

or

>> b = [2
4
6
8
10]

or, by transposing a row vector with the ' operator,

>> b = [2 4 6 8 10]'

The result in all three cases will be

b =
     2
     4
     6
     8
    10
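
One way to convince yourself that these entry styles are equivalent is to compare the results directly. The sketch below uses the built-in isequal and size functions, which have not been introduced in the text; the names b1, b2, same, sz_vector, and sz_scalar are illustrative assumptions.

```matlab
b1 = [2;4;6;8;10];        % semicolons separate the rows
b2 = [2 4 6 8 10]';       % transpose of a row vector

same = isequal(b1, b2);   % true: both are 5-by-1 column vectors

sz_vector = size(b1);     % [5 1]: five rows, one column
sz_scalar = size(4);      % [1 1]: a scalar is a 1-by-1 matrix
```

The last line illustrates the earlier remark that scalars are actually matrices with one row and one column.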
A matrix of values can be assigned as follows: 


>> A = [1 2 3; 4 5 6; 7 8 9]


A= 
1 2 3 
4 5 6 
7 8 9 


In addition, the Enter key (carriage return) can be used to separate the rows. For example,
in the following case, the Enter key would be struck after the 3, the 6, and the ] to assign
the matrix:

>> A = [1 2 3
4 5 6
7 8 9]

Finally, we could construct the same matrix by concatenating (i.e., joining) the vectors
representing each column:


>> A = [[1 4 7]' [2 5 8]' [3 6 9]']


At any point in a session, a list of all current variables can be obtained by entering the 
who command: 


>> who 


Your variables are: 
A    a    ans    b    x


or, with more detail, enter the whos command: 


>> whos 
Name Size Bytes Class 
A 3x3 72 double array 
a 1x5 40 double array 
ans 1x1 8 double array 
b 5x1 40 double array 
x 1x1 16 double array (complex) 


Grand total is 21 elements using 176 bytes 


Note that subscript notation can be used to access an individual element of an array. 
For example, the fourth element of the column vector b can be displayed as 


>> b(4) 


ans = 
8 


For an array, A(m,n) selects the element in the mth row and the nth column. For example,
>> A(2,3) 


ans = 
6 


There are several built-in functions that can be used to create matrices. For exam- 
ple, the ones and zeros functions create vectors or matrices filled with ones and zeros, 
respectively. Both have two arguments, the first for the number of rows and the second for 
the number of columns. For example, to create a 2 x 3 matrix of zeros: 


>> E = zeros(2,3) 


E= 
0 0 0 
0 0 0 


Similarly, the ones function can be used to create a row vector of ones: 
>> u = ones(1,3) 


u =
     1     1     1

2.2.3 The Colon Operator


The colon operator is a powerful tool for creating and manipulating arrays. If a colon is 
used to separate two numbers, MATLAB generates the numbers between them using an 
increment of one: 


>> t=1:5 


t =
1 2 3 4 5 


If colons are used to separate three numbers, MATLAB generates the numbers between the 
first and third numbers using an increment equal to the second number: 


>> t = 1:0.5:3 


t =
1.0000 1.5000 2.0000 2.5000 3.0000 


Note that negative increments can also be used:

>> t = 10:-1:5

t =
    10     9     8     7     6     5


Aside from creating series of numbers, the colon can also be used as a wildcard to 
select the individual rows and columns of a matrix. When a colon is used in place of a 
specific subscript, the colon represents the entire row or column. For example, the second 
row of the matrix A can be selected as in 


>> A(2,:) 


ans = 
4 5 6 


We can also use the colon notation to selectively extract a series of elements from 
within an array. For example, based on the previous definition of the vector t: 


>> t(2:4) 


ans = 
9 8 7 


Thus, the second through the fourth elements are returned. 
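
The same ideas combine naturally. The following sketch, which is not part of the original session, selects a column and a block of a matrix and shows what happens when the increment does not land exactly on the upper limit. It also uses the end keyword, which stands for the last index and has not been introduced in the text; the variable names are illustrative assumptions.

```matlab
A = [1 2 3; 4 5 6; 7 8 9];
t = 10:-1:5;

col2  = A(:,2);        % entire second column: [2; 5; 8]
block = A(2:3,1:2);    % rows 2 to 3, columns 1 to 2: [4 5; 7 8]
tail  = t(4:end);      % fourth element through the last: [7 6 5]

% If the increment does not land exactly on the upper limit,
% the sequence stops at the last value that does not pass it.
s = 0:2:7;             % [0 2 4 6]
```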


2.2.4 The linspace and logspace Functions 


The linspace and logspace functions provide other handy tools to generate vectors of spaced 
points. The linspace function generates a row vector of equally spaced points. It has the 
form 


linspace(x1, x2, n) 
which generates n points between x1 and x2. For example,
>> linspace(0,1,6) 


ans = 
0 0.2000 0.4000 0.6000 0.8000 1.0000 


If n is omitted, the function automatically generates 100 points.
The logspace function generates a row vector that is logarithmically equally spaced. It 
has the form 


logspace(x1, x2, n) 


which generates n logarithmically equally spaced points between decades $10^{x1}$ and $10^{x2}$.
For example,


>> logspace(-1,2,4) 


ans = 
0.1000 1.0000 10.0000 100.0000 


If n is omitted, it automatically generates 50 points.
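
To see how these functions relate to tools already covered, the sketch below reproduces a linspace result with the colon operator and a logspace result by raising 10 to equally spaced exponents. It is an illustration only; the variable names are assumptions, and the comparisons allow for roundoff rather than demanding exact equality.

```matlab
n  = 6;
x1 = linspace(0, 1, n);            % 0, 0.2, ..., 1
h  = (1 - 0)/(n - 1);              % spacing implied by n points
x2 = 0:h:1;                        % same points via the colon operator
max_diff = max(abs(x1 - x2));      % zero or at roundoff level

p1 = logspace(-1, 2, 4);           % 0.1, 1, 10, 100
p2 = 10.^linspace(-1, 2, 4);       % 10 raised to equally spaced exponents
max_rel = max(abs(p1 - p2)./p1);   % zero or at roundoff level
```

Note that linspace is specified by the number of points, whereas the colon operator is specified by the increment, so linspace is often the safer choice when the end point must be included exactly.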


2.2.5 Character Strings 


Aside from numbers, alphanumeric information or character strings can be represented by 
enclosing the strings within single quotation marks. For example, 


>> f = 'Miles ';
>> s = 'Davis'; 


Each character in a string is one element in an array. Thus, we can concatenate (i.e., paste 
together) strings as in 


>> x= [f s] 


x =
Miles Davis 


Note that very long lines can be continued by placing an ellipsis (three consecutive 
periods) at the end of the line to be continued. For example, a row vector could be entered as 


>> a = [1 2 3 4 5 ...
6 7 8]

a =
     1     2     3     4     5     6     7     8


However, you cannot use an ellipsis within single quotes to continue a string. To enter a 
string that extends beyond a single line, piece together shorter strings as in 


>> quote = ['Any fool can make a rule,' ... 
' and any fool will mind it'] 


quote = 
Any fool can make a rule, and any fool will mind it 
TABLE 2.2 Some useful string functions.

Function               Description
n=length(s)            Number of characters, n, in a string, s.
b=strcmp(s1,s2)        Compares two strings, s1 and s2; if equal returns true (b = 1). If not equal,
                       returns false (b = 0).
n=str2num(s)           Converts a string, s, to a number, n.
s=num2str(n)           Converts a number, n, to a string, s.
s2=strrep(s1,c1,c2)    Replaces characters in a string with different characters.
i=strfind(s1,s2)       Returns the starting indices of any occurrences of the string s2 in the
                       string s1.
S=upper(s)             Converts a string to upper case.
s=lower(S)             Converts a string to lower case.


A number of built-in MATLAB functions are available to operate on strings. Table 2.2 
lists a few of the more commonly used ones. For example, 


>> x1 = 'Canada'; x2 = 'Mexico'; x3 = 'USA'; x4 = '2010'; x5 = 810;
>> strcmp(x1,x2)
ans =
     0
>> strcmp(x2,'Mexico')
ans =
     1
>> str2num(x4)
ans =
        2010
>> num2str(x5)
ans =
810

>> strrep 

>> lower 


>> upper 
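
The remaining functions in Table 2.2 can be exercised in the same way. The calls below are a sketch using the strings defined above; the particular arguments (replacing 'a' with 'o', searching for 'a') and the result names are illustrative assumptions, not the original session.

```matlab
x1 = 'Canada'; x3 = 'USA';

n  = length(x1);               % 6 characters
s2 = strrep(x1, 'a', 'o');     % 'Conodo': every 'a' is replaced
k  = strfind(x1, 'a');         % [2 4 6]: starting index of each 'a'
lo = lower(x3);                % 'usa'
up = upper(x1);                % 'CANADA'
```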


Note, if you want to display strings in multiple lines, use the sprintf function and insert
the two-character sequence \n between the strings. For example,

>> disp(sprintf('Yo\nAdrian!'))

yields

Yo
Adrian!


2.3 MATHEMATICAL OPERATIONS


Operations with scalar quantities are handled in a straightforward manner, similar to other 
computer languages. The common operators, in order of priority, are 


^      Exponentiation
-      Negation
* /    Multiplication and division
\      Left division²
+ -    Addition and subtraction


These operators will work in calculator fashion. Try 
>> 2*pi 


ans = 
6.2832 


Also, scalar real variables can be included: 


>> y = pi/4; 

>> y ^ 2.45 

ans = 
0.5533 


Results of calculations can be assigned to a variable, as in the next-to-last example, or 
simply displayed, as in the last example. 

As with other computer calculations, the priority order can be overridden with parentheses.
For example, because exponentiation has higher priority than negation, the following
result would be obtained:


>> y = -4^2 


y= 
-16 


Thus, 4 is first squared and then negated. Parentheses can be used to override the priorities 
as in 
>> y = (-4)^2


y= 
16 


Within each precedence level, operators have equal precedence and are evaluated from left 
to right. As an example, 


>> 4^2^3 


>> 4^(2^3) 
>> (4^2)^3 


² Left division applies to matrix algebra. It will be discussed in detail later in this book.

In the first case $4^2 = 16$ is evaluated first, which is then cubed to give 4096. In the second
case $2^3 = 8$ is evaluated first and then $4^8 = 65{,}536$. The third case is the same as the first,
but uses parentheses to be clearer.

One potentially confusing operation is negation; that is, when a minus sign is em- 
ployed with a single argument to indicate a sign change. For example, 


>> 2*-4 


The −4 is treated as a number, so you get −8. As this might be unclear, you can use parentheses
to clarify the operation

>> 2*(-4)

Here is a final example where the minus is used for negation

>> 2^-4

Again −4 is treated as a number, so 2^-4 = $2^{-4}$ = $1/2^4$ = 1/16 = 0.0625. Parentheses can
make the operation clearer

>> 2^(-4)
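
The priority rules discussed above can be collected into one short script and checked against the values quoted in the text. This is a summary sketch; the names p1 through p5 are illustrative assumptions.

```matlab
p1 = 4^2^3;       % equal priority, left to right: (4^2)^3 = 4096
p2 = 4^(2^3);     % parentheses force 4^8 = 65536
p3 = -4^2;        % exponentiation before negation: -(4^2) = -16
p4 = (-4)^2;      % parentheses negate first: 16
p5 = 2^-4;        % the minus sign applies to the 4: 0.0625
```

When in doubt, adding parentheses costs nothing and removes any ambiguity for the reader.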


Calculations can also involve complex quantities. Here are some examples that use the 
values of x (2 + 4i) and y (16) defined previously: 
>> 3 * x
ans =
   6.0000 + 12.0000i
>> 1/x
ans =
   0.1000 - 0.2000i
>> x^2
ans =
 -12.0000 + 16.0000i
>> x + y
ans =
  18.0000 + 4.0000i
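
If you want to take a complex result apart, MATLAB provides built-in functions for that purpose. The sketch below uses real, imag, abs, and conj, none of which has been introduced in the text, to reproduce the value of 1/x shown above; the variable names are illustrative assumptions.

```matlab
x = 2 + 4i;

re = real(x);        % real part: 2
im = imag(x);        % imaginary part: 4
r  = abs(x);         % magnitude: sqrt(2^2 + 4^2) = sqrt(20)
xc = conj(x);        % complex conjugate: 2 - 4i

% 1/x equals the conjugate divided by the squared magnitude
inv_x = xc/r^2;      % 0.1000 - 0.2000i
```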


The real power of MATLAB is illustrated in its ability to carry out vector-matrix 
calculations. Although we will describe such calculations in detail in Chap. 8, it is worth 
introducing some examples here. 

The inner product of two vectors (dot product) can be calculated using the * 
operator, 


>> a*b


ans = 
110 
and likewise, the outer product

>> b*a

ans =
     2     4     6     8    10
     4     8    12    16    20
     6    12    18    24    30
     8    16    24    32    40
    10    20    30    40    50


To further illustrate vector-matrix multiplication, first redefine a and b:

>> a = [1 2 3];

and

>> b = [4 5 6]';

Now, try

>> a*A


ans = 
30 36 42 


or 
>> A*b


ans = 
32 

77 

122 


Matrices cannot be multiplied if the inner dimensions are unequal. Here is what happens 
when the dimensions are not those required by the operations. Try 


>> A*a
MATLAB automatically displays the error message: 


??? Error using ==> mtimes 
Inner matrix dimensions must agree. 
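
A quick way to anticipate this error is to compare the inner dimensions before multiplying. The following sketch uses the size function with two outputs, which is an assumption beyond the text, as are the variable names.

```matlab
A = [1 2 3; 4 5 6; 7 8 9];
a = [1 2 3];

[rA, cA] = size(A);      % 3 rows, 3 columns
[ra, ca] = size(a);      % 1 row,  3 columns

ok_aA = (ca == rA);      % true:  (1x3)*(3x3) is defined
ok_Aa = (cA == ra);      % false: (3x3)*(1x3) is not

% Transposing a gives a 3x1 column, so A*a' is defined
y = A*a';                % [14; 32; 50]
```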


Matrix-matrix multiplication is carried out in likewise fashion: 


>> A*A

ans =
    30    36    42
    66    81    96
   102   126   150

Mixed operations with scalars are also possible:


>> A/pi 


ans = 
0.3183 0.6366 0.9549 
1.2732 1.5915 1.9099 
2.2282 2.5465 2.8648 


We must always remember that MATLAB will apply the simple arithmetic operators 
in vector-matrix fashion if possible. At times, you will want to carry out calculations item 
by item in a matrix or vector. MATLAB provides for that too. For example, 


>> A^2

ans = 
30 36 42 
66 81 96 
102 126 150 


results in matrix multiplication of A with itself. 
What if you want to square each element of A? That can be done with 


>> A.^2

ans = 
1 4 9 
16 25 36 
49 64 81 


The . preceding the ^ operator signifies that the operation is to be carried out element by 
element. The MATLAB manual calls these array operations. They are also often referred 
to as element-by-element operations. 
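
The period works the same way with multiplication and division. As a brief sketch (the result names M, E, Q, and H are illustrative assumptions):

```matlab
A = [1 2 3; 4 5 6; 7 8 9];

M = A*A;       % matrix product (rows times columns), same as A^2
E = A.*A;      % element-by-element product, same as A.^2
Q = A./A;      % element-by-element division: all ones

% Operations with a scalar are applied to every element anyway,
% so no period is needed there.
H = A/2;
```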

MATLAB contains a helpful shortcut for performing calculations that you’ve already 
done. Press the up-arrow key. You should get back the last line you typed in. 


>> A.^2


Pressing Enter will perform the calculation again. But you can also edit this line. For 
example, change it to the line below and then press Enter. 


>> A.^3
ans = 
1 8 27 
64 125 216 
343 512 729 


Using the up-arrow key, you can go back to any command that you entered. Press the up- 
arrow until you get back the line 


>> b*a

Alternatively, you can type b and press the up-arrow once and it will automatically bring
up the last command beginning with the letter b. The up-arrow shortcut is a quick way to
fix errors without having to retype the entire line.


2.4 USE OF BUILT-IN FUNCTIONS


MATLAB and its Toolboxes have a rich collection of built-in functions. You can use 
online help to find out more about them. For example, if you want to learn about the log 
function, type in 


>> help log 
LOG Natural logarithm. 
LOG(X) is the natural logarithm of the elements of X.
Complex results are produced if X is not positive. 


See also LOG2, LOG10, EXP, LOGM. 
For a list of all the elementary functions, type 
>> help elfun 
One of the important properties of MATLAB’s built-in functions is that they will
operate directly on vector and matrix quantities. For example, try 
>> log(A) 
ans = 
0 0.6931 1.0986 
1.3863 1.6094 1.7918 
1.9459 2.0794 2.1972 
and you will see that the natural logarithm function is applied in array style, element by 
element, to the matrix A. Most functions, such as sqrt, abs, sin, acos, tanh, and exp, operate 
in array fashion. Certain functions, such as exponential and square root, have matrix defini- 
tions also. MATLAB will evaluate the matrix version when the letter m is appended to the 
function name. Try 
>> sqrtm(A) 


ans = 
   0.4498 + 0.7623i   0.5526 + 0.2068i   0.6555 - 0.3487i
   1.0185 + 0.0842i   1.2515 + 0.0228i   1.4844 - 0.0385i
   1.5873 - 0.5940i   1.9503 - 0.1611i   2.3134 + 0.2717i


There are several functions for rounding. For example, suppose that we enter a vector: 
>> E = [-1.6 -1.5 -1.4 1.4 1.5 1.6]; 
The round function rounds the elements of E to the nearest integers: 


>> round(E) 


ans = 
    -2    -2    -1     1     2     2


The ceil (short for ceiling) function rounds to the nearest integers toward infinity: 
>> ceil(E) 
ans = 
-1 -1 -1 2 2 2 
The floor function rounds down to the nearest integers toward minus infinity: 
>> floor (E) 


ans = 
    -2    -2    -2     1     1     1
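
Seeing the rounding functions side by side makes the differences easier to remember. The sketch below also includes fix, a built-in function not covered in the text that rounds toward zero; the variable names are illustrative assumptions.

```matlab
E = [-1.6 -1.5 -1.4 1.4 1.5 1.6];

r = round(E);    % nearest integer:        -2 -2 -1  1  2  2
c = ceil(E);     % toward plus infinity:   -1 -1 -1  2  2  2
f = floor(E);    % toward minus infinity:  -2 -2 -2  1  1  1
z = fix(E);      % toward zero:            -1 -1 -1  1  1  1

% Stacking the results as rows makes the comparison easy to read
T = [E; r; c; f; z];
```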
There are also functions that perform special actions on the elements of matrices and 
arrays. For example, the sum function returns the sum of the elements: 
>> F = [3 5 4 6 1];
>> sum(F) 


ans = 
19 
In a similar way, it should be pretty obvious what’s happening with the following
commands:

>> min(F),max(F),mean(F),prod(F),sort(F)

ans =
     1
ans =
     6
ans =
    3.8000
ans =
   360
ans =
     1     3     4     5     6
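
These functions have a few further behaviors worth knowing. As a sketch under the assumption that you also want the location of an extreme value, or sums over a matrix rather than a vector (the two-output form of min and max and the second argument of sum are not covered in the text, and the variable names are illustrative):

```matlab
F = [3 5 4 6 1];
A = [1 2 3; 4 5 6; 7 8 9];

[fmin, imin] = min(F);   % fmin = 1 at position imin = 5
[fmax, imax] = max(F);   % fmax = 6 at position imax = 4

col_sums = sum(A);       % sums down each column: [12 15 18]
row_sums = sum(A, 2);    % sums across each row:  [6; 15; 24]
total    = sum(A(:));    % sum of every element:  45
```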


A common use of functions is to evaluate a formula for a series of arguments. Recall 
that the velocity of a free-falling bungee jumper can be computed with [Eq. (1.9)]: 


$$v = \sqrt{\frac{gm}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right)$$

where $v$ is velocity (m/s), $g$ is the acceleration due to gravity (9.81 m/s²), $m$ is mass (kg),
$c_d$ is the drag coefficient (kg/m), and $t$ is time (s).

Create a column vector t that contains values from 0 to 20 in steps of 2: 

>> t = [0:2:20]'

t =
     0
     2
     4
     6
     8
    10
    12
    14
    16
    18
    20


Check the number of items in the t array with the length function: 
>> length(t) 
ans = 
11 
Assign values to the parameters: 
>> g = 9.81; m = 68.1; cd = 0.25; 
MATLAB allows you to evaluate a formula such as v = f(t), where the formula is computed
for each value of the t array, and the result is assigned to a corresponding position in
the v array. For our case,

>> v = sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t)

v =
         0
   18.7292
   33.1118
   42.0762
   46.9575
   49.4214
   50.6175
   51.1871
   51.4560
   51.5823
   51.6416
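
The individual commands above can be gathered into one short sequence. The sketch below also compares the final velocity with the terminal velocity from Chap. 1; the names vt and frac are illustrative assumptions rather than part of the original session.

```matlab
g = 9.81; m = 68.1; cd = 0.25;     % parameters from the text
t = [0:2:20]';                     % column vector of times (s)

% tanh acts on every element of t, so no periods are needed here
v = sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t);

vt   = sqrt(g*m/cd);               % terminal velocity (m/s)
frac = v(end)/vt;                  % fraction of terminal velocity at t = 20 s

disp([t v])                        % time and velocity side by side
```

After 20 s the jumper has reached roughly 99.9% of the terminal velocity, which is why the computed values level off.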


2.5 GRAPHICS 
MATLAB allows graphs to be created quickly and conveniently. For example, to create a 
graph of the t and v arrays from the data above, enter 
>> plot(t, v) 
The graph appears in the graphics window and can be printed or transferred via the clip- 
board to other programs. 




You can customize the graph a bit with commands such as the following: 


>> title('Plot of v versus t') 
>> xlabel('Values of t') 

>> ylabel('Values of v') 

>> grid 
The plot command displays a solid thin blue line by default. If you want to plot each
point with a symbol, you can include a specifier enclosed in single quotes in the plot 
function. Table 2.3 lists the available specifiers. For example, if you want to use open 
circles enter 


>> plot(t, v, 'o') 


TABLE 2.3 Specifiers for colors, symbols, and line types.

Colors          Symbols                Line Types
Blue      b     Point             .    Solid     -
Green     g     Circle            o    Dotted    :
Red       r     X-mark            x    Dashdot   -.
Cyan      c     Plus              +    Dashed    --
Magenta   m     Star              *
Yellow    y     Square            s
Black     k     Diamond           d
White     w     Triangle(down)    v
                Triangle(up)      ^
                Triangle(left)    <
                Triangle(right)   >
                Pentagram         p
                Hexagram          h
You can also combine several specifiers. For example, if you want to use square green
markers connected by green dashed lines, you could enter 


>> plot(t, v, 's--g') 


You can also control the line width as well as the marker’s size and its edge and face (i.e., 
interior) colors. For example, the following command uses a heavier (2-point), dashed, cyan 
line to connect larger (10-point) diamond-shaped markers with black edges and magenta faces: 
>> plot(x,y,'--dc','LineWidth',2,...
'MarkerSize',10,...
'MarkerEdgeColor','k',...
'MarkerFaceColor','m')


Note that the default line width is 1 point. For the markers, the default size is 6 point with 
blue edge color and no face color. 

MATLAB allows you to display more than one data set on the same plot. For example, 
an alternative way to connect each data marker with a straight line would be to type 

>> plot(t, v, t, v, 'o') 

It should be mentioned that, by default, previous plots are erased every time the plot 
command is implemented. The hold on command holds the current plot and all axis proper- 
ties so that additional graphing commands can be added to the existing plot. The hold off 
command returns to the default mode. For example, if we had typed the following com- 
mands, the final plot would only display symbols: 


>> plot(t, v) 
>> plot(t, v, 'o') 


In contrast, the following commands would result in both lines and symbols being displayed: 


>> plot(t, v) 

>> hold on 

>> plot(t, v, 'o') 

>> hold off 
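
Putting the pieces of this section together, a complete plotting sequence might look like the following. This is a sketch that reuses the bungee jumper data from Sec. 2.4; the particular choice of a green dashed line with black square markers is an illustrative assumption.

```matlab
g = 9.81; m = 68.1; cd = 0.25;
t = [0:2:20]';
v = sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t);

plot(t, v, '--g')          % green dashed line
hold on                    % keep the line while adding to the plot
plot(t, v, 'sk')           % black squares on the same axes
hold off                   % return to the default mode

title('Plot of v versus t')
xlabel('Values of t')
ylabel('Values of v')
grid
```

Issuing hold off at the end is a good habit, since otherwise the next plot command would be drawn on top of this one.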

In addition to hold, another handy function is subplot, which allows you to split the 
graph window into subwindows or panes. It has the syntax 

subplot(m, n, p) 

This command breaks the graph window into an m-by-n matrix of small axes, and selects
the p-th axes for the current plot. 

We can demonstrate subplot by examining MATLAB’s capability to generate three- 
dimensional plots. The simplest manifestation of this capability is the plot3 command 
which has the syntax 

plot3(x, y, z)
where x, y, and z are three vectors of the same length. The result is a line in three-dimen- 
sional space through the points whose coordinates are the elements of x, y, and z. 

Plotting a helix provides a nice example to illustrate its utility. First, let’s graph a circle with
the two-dimensional plot function using the parametric representation: x = sin(t) and y = cos(t).
We employ the subplot command so we can subsequently add the three-dimensional plot.

>> t = 0:pi/50:10*pi; 

>> subplot(1,2,1);plot(sin(t),cos(t)) 
>> axis square

>> title('(a)') 
As in Fig. 2.1a, the result is a circle. Note that the circle would have been distorted if we 
had not used the axis square command. 

Now, let’s add the helix to the graph’s right pane. To do this, we again employ a parametric
representation: x = sin(t), y = cos(t), and z = t.


>> subplot(1,2,2);plot3(sin(t),cos(t),t); 

>> title('(b)') 

The result is shown in Fig. 2.1b. Can you visualize what’s going on? As time evolves, 
the x and y coordinates sketch out the circumference of the circle in the x—y plane in the 
same fashion as the two-dimensional plot. However, simultaneously, the curve rises verti- 
cally as the z coordinate increases linearly with time. The net result is the characteristic 
spring or spiral staircase shape of the helix. 

There are other features of graphics that are useful—for example, plotting objects 
instead of lines, families of curves plots, plotting on the complex plane, log-log or semilog 
plots, three-dimensional mesh plots, and contour plots. As described next, a variety of re- 
sources are available to learn about these as well as other MATLAB capabilities. 
FIGURE 2.1
A two-pane plot of (a) a two-dimensional circle and (b) a three-dimensional helix. 
2.6 OTHER RESOURCES


The foregoing was designed to focus on those features of MATLAB that we will be using 
in the remainder of this book. As such, it is obviously not a comprehensive overview of all 
of MATLAB’s capabilities. If you are interested in learning more, you should consult one 
of the excellent books devoted to MATLAB (e.g., Attaway, 2009; Palm, 2007; Hanselman 
and Littlefield, 2005; and Moore, 2008). 

Further, the package itself includes an extensive Help facility that can be accessed by 
clicking on the Help menu in the command window. This will provide you with a number 
of different options for exploring and searching through MATLAB’s Help material. In ad- 
dition, it provides access to a number of instructive demos. 

As described in this chapter, help is also available in interactive mode by typing the 
help command followed by the name of a command or function. 

If you do not know the name, you can use the lookfor command to search the MATLAB 
Help files for occurrences of text. For example, suppose that you want to find all the com- 
mands and functions that relate to logarithms, you could enter 


>> lookfor logarithm 


and MATLAB will display all references that include the word logarithm. 

Finally, you can obtain help from The MathWorks, Inc., website at www.mathworks 
.com. There you will find links to product information, newsgroups, books, and technical 
support as well as a variety of other useful resources. 


2.7 CASE STUDY    EXPLORATORY DATA ANALYSIS


Background. Your textbooks are filled with formulas developed in the past by re- 
nowned scientists and engineers. Although these are of great utility, engineers and sci- 
entists often must supplement these relationships by collecting and analyzing their own 
data. Sometimes this leads to a new formula. However, prior to arriving at a final predic- 
tive equation, we usually “play” with the data by performing calculations and developing 
plots. In most cases, our intent is to gain insight into the patterns and mechanisms hidden 
in the data. 

In this case study, we will illustrate how MATLAB facilitates such exploratory 
data analysis. We will do this by estimating the drag coefficient of a free-falling human 
based on Eq. (2.1) and the data from Table 2.1. However, beyond merely computing 
the drag coefficient, we will use MATLAB’s graphical capabilities to discern patterns 
in the data. 


Solution. The data from Table 2.1 along with gravitational acceleration can be entered as 


>> m=[83.6 60.2 72.1 91.1 92.9 65.3 80.9];
>> vt=[53.4 48.5 50.9 55.7 54 47.7 51.1];
>> g=9.81;


The drag coefficients can then be computed with Eq. (2.1). Because we are performing 
element-by-element operations on vectors, we must include periods prior to the operators: 


>> cd=g*m./vt.^2
cd =
    0.2876    0.2511    0.2730    0.2881    0.3125    0.2815    0.3039

We can now use some of MATLAB’s built-in functions to generate some statistics for the
results:

>> cdavg=mean(cd),cdmin=min(cd),cdmax=max(cd)
cdavg =
    0.2854
cdmin =
    0.2511
cdmax =
    0.3125


Thus, the average value is 0.2854 with a range from 0.2511 to 0.3125 kg/m. 
Now, let’s start to play with these data by using Eq. (2.1) to make a prediction of the 
terminal velocity based on the average drag: 


>> vpred=sqrt(g*m/cdavg) 
vpred = 
53.6065 45.4897 49.7831 55.9595 56.5096 47.3774 52.7338 

Notice that we do not have to use periods prior to the operators in this formula. Do you
understand why?
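
One way to explore the question is to compute the prediction both with and without periods. The sketch below is an illustration, with vpred2 and same as assumed names: because g and cdavg are scalars, multiplying or dividing the vector m by them is already applied to every element, so the periods change nothing.

```matlab
g  = 9.81;
m  = [83.6 60.2 72.1 91.1 92.9 65.3 80.9];
vt = [53.4 48.5 50.9 55.7 54 47.7 51.1];

cd    = g*m./vt.^2;          % vector ./ vector: periods are required
cdavg = mean(cd);            % a single number

% m is a vector, but g and cdavg are scalars
vpred  = sqrt(g*m/cdavg);
vpred2 = sqrt(g.*m./cdavg);  % the periods are harmless but unnecessary

same = isequal(vpred, vpred2);
```

In contrast, the earlier expression for cd divides one vector by another, which is why the periods were essential there.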

We can plot these values versus the actual measured terminal velocities. We will also 
superimpose a line indicating exact predictions (the 1:1 line) to help assess the results. 
Because we are going to eventually generate a second plot, we employ the subplot command: 


>> subplot(2,1,1);plot(vt,vpred,'o',vt,vt)
>> xlabel('measured')
>> ylabel('predicted')
>> title('Plot of predicted versus measured velocities')


As in the top plot of Fig. 2.2, because the predictions generally follow the 1:1 line, you 
might initially conclude that the average drag coefficient yields decent results. However, 
notice how the model tends to underpredict the low velocities and overpredict the high. 
This suggests that rather than being constant, there might be a trend in the drag coeffi- 
cients. This can be seen by plotting the estimated drag coefficients versus mass: 

>> subplot(2,1,2);plot(m,cd,'o')
>> xlabel('mass (kg)')
>> ylabel('estimated drag coefficient (kg/m)')
>> title('Plot of drag coefficient versus mass')


FIGURE 2.2
Two plots created with MATLAB.

The resulting plot, which is the bottom graph in Fig. 2.2, suggests that rather than being
constant, the drag coefficient seems to be increasing as the mass of the jumper increases.
Based on this result, you might conclude that your model needs to be improved. At the
least, it might motivate you to conduct further experiments with a larger number of jumpers
to confirm your preliminary finding.
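
The under- and overprediction noted above can also be examined numerically rather than only by eye. The following sketch tabulates the prediction errors against mass; the names resid, msorted, and k are illustrative assumptions, and the two-output form of sort is not covered in the text.

```matlab
g  = 9.81;
m  = [83.6 60.2 72.1 91.1 92.9 65.3 80.9];
vt = [53.4 48.5 50.9 55.7 54 47.7 51.1];

cd    = g*m./vt.^2;
vpred = sqrt(g*m/mean(cd));

resid = vpred - vt;          % positive where the model overpredicts

% Sort the jumpers by mass so any trend is easier to see
[msorted, k] = sort(m);
disp([msorted' cd(k)' resid(k)'])
```

With only seven jumpers the pattern is suggestive rather than conclusive, which is consistent with the call for further experiments.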

In addition, the result might also stimulate you to go to the fluid mechanics literature
and learn more about the science of drag. As described previously in Sec. 1.4, you would
discover that the parameter $c_d$ is actually a lumped drag coefficient that along with the true
drag includes other factors such as the jumper’s frontal area and air density:

$$c_d = \frac{C_D \rho A}{2} \qquad (2.2)$$

where $C_D$ = a dimensionless drag coefficient, $\rho$ = air density (kg/m³), and $A$ = frontal
area (m²), which is the area projected on a plane normal to the direction of the velocity.

Assuming that the densities were relatively constant during data collection (a pretty
good assumption if the jumpers all took off from the same height on the same day), Eq. (2.2)
suggests that heavier jumpers might have larger areas. This hypothesis could be substantiated
by measuring the frontal areas of individuals of varying masses.
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PROBLEMS 


2.1 What is the output when the following commands are 
implemented? 

A=[1:3;2:2:6;3:-1:1]
A=A'
A(:,3)=[]
A=[A(:,1) [4 5 7]' A(:,2)]
A=sum(diag(A))

2.2 You want to write MATLAB equations to compute a
vector of y values using the following equations

(a) $y = \dfrac{6t^3 - 3t - 4}{8\sin(5t)}$

(b) $y = \dfrac{6t - 4}{8t} - \dfrac{\pi}{2}t$

where t is a vector. Make sure that you use periods only
where necessary so the equation handles vector operations
properly. Extra periods will be considered incorrect.

2.3 Write a MATLAB expression to compute and display
the values of a vector of x values using the following
equation

$$x = \frac{y(a + bz)^{1.8}}{z(1 - y)}$$

Assume that y and z are vector quantities of equal length and
a and b are scalars.
2.4 What is displayed when the following MATLAB state- 
ments are executed? 
(a) A=[1 2; 3 4; 5 6]; A(2,:)' 
(b) y=[0:1.5:7]' 
(c) a=2; b=8; c=4; a+ b/c 
2.5 The MATLAB humps function defines a curve that has
2 maxima (peaks) of unequal height over the interval 0 ≤ x ≤ 2,

$$f(x) = \frac{1}{(x - 0.3)^2 + 0.01} + \frac{1}{(x - 0.9)^2 + 0.04} - 6$$

Use MATLAB to generate a plot of f(x) versus x with

x = [0:1/256:2];

Do not use MATLAB’s built-in humps function to generate
the values of f(x). Also, employ the minimum number of
periods to perform the vector operations needed to generate
f(x) values for the plot.
2.6 Use the linspace function to create vectors identical to 
the following created with colon notation: 
(a) t = 4:6:35 
(b) x= -4:2 
2.7 Use colon notation to create vectors identical to the
following created with the linspace function:
(a) v = linspace(-2,1.5,8)
(b) r = linspace(8,4.5,8)
2.8 The command linspace(a, b, n) generates a row vector
of n equally spaced points between a and b. Use colon
notation to write an alternative one-line command to generate
the same vector. Test your formulation for a = −3, b = 5,
n = 6.

2.9 The following matrix is entered in MATLAB: 
>> A=[3 2 1;0:0.5:1;linspace(6, 8, 3)] 

(a) Write out the resulting matrix. 

(b) Use colon notation to write a single-line MATLAB 
command to multiply the second row by the third col- 
umn and assign the result to the variable c. 

2.10 The following equation can be used to compute values
of y as a function of x:

$$y = be^{-ax}\sin(bx)\,(0.012x^4 - 0.15x^3 + 0.075x^2 + 2.5x)$$

where a and b are parameters. Write the equation for implementation
with MATLAB, where a = 2, b = 5, and x is a
vector holding values from 0 to π/2 in increments of Δx =
π/40. Employ the minimum number of periods (i.e., dot notation)
so that your formulation yields a vector for y. In addition,
compute the vector $z = y^2$ where each element holds
the square of each element of y. Combine x, y, and z into
a matrix w, where each column holds one of the variables,
and display w using the short g format. In addition, generate
a labeled plot of y and z versus x. Include a legend on
the plot (use help to understand how to do this). For y, use
a 1.5-point, dashdotted red line with 14-point, red-edged,
white-faced pentagram-shaped markers. For z, use a standard-sized
(i.e., default) solid blue line with standard-sized,
blue-edged, green-faced square markers.

2.11 A simple electric circuit consisting of a resistor, a capacitor,
and an inductor is depicted in Fig. P2.11. The charge
on the capacitor q(t) as a function of time can be computed
as

$$q(t) = q_0 e^{-Rt/(2L)} \cos\left[\sqrt{\frac{1}{LC} - \left(\frac{R}{2L}\right)^2}\; t\right]$$

where t = time, $q_0$ = the initial charge, R = the resistance, L =
inductance, and C = capacitance. Use MATLAB to generate
a plot of this function from t = 0 to 0.8, given that $q_0$ = 10,
R = 60, L = 9, and C = 0.00005.
2.12 The standard normal probability density function is a
bell-shaped curve that can be represented as

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$

Use MATLAB to generate a plot of this function from z =
−5 to 5. Label the ordinate as frequency and the abscissa as z.
2.13 If a force F (N) is applied to compress a spring, its 
displacement x (m) can often be modeled by Hooke’s law: 


F = kx 


where k = the spring constant (N/m). The potential energy
stored in the spring U (J) can then be computed as

$$U = \frac{1}{2}kx^2$$

Five springs are tested and the following data compiled:

F, N    14      18      8       9       13
x, m    0.013   0.020   0.009   0.010   0.012


Use MATLAB to store F and x as vectors and then compute 
vectors of the spring constants and the potential energies. 
Use the max function to determine the maximum potential 
energy. 

2.14 The density of freshwater can be computed as a function
of temperature with the following cubic equation:

$$\rho = 5.5289 \times 10^{-8}\,T_C^3 - 8.5016 \times 10^{-6}\,T_C^2 + 6.5622 \times 10^{-5}\,T_C + 0.99987$$

where $\rho$ = density (g/cm³) and $T_C$ = temperature (°C). Use
MATLAB to generate a vector of temperatures ranging from
32 °F to 93.2 °F using increments of 3.6 °F. Convert this
vector to degrees Celsius and then compute a vector of densities
based on the cubic formula. Create a plot of $\rho$ versus
$T_C$. Recall that $T_C = 5/9\,(T_F - 32)$.
2.15 Manning’s equation can be used to compute the velocity
of water in a rectangular open channel:

$$U = \frac{\sqrt{S}}{n}\left(\frac{BH}{B + 2H}\right)^{2/3}$$

where U = velocity (m/s), S = channel slope, n = roughness
coefficient, B = width (m), and H = depth (m). The following
data are available for five channels:


n S B H 
0.035 0.0001 10 2 
0.020 0.0002 8 1 
0.015 0.0010 20 1.5 
0.030 0.0007 24 3 
0.022 0.0003 15 2.5 


Store these values in a matrix where each row represents one 
of the channels and each column represents one of the param- 
eters. Write a single-line MATLAB statement to compute a 
column vector containing the velocities based on the values 
in the parameter matrix. 

2.16 It is general practice in engineering and science that 
equations be plotted as lines and discrete data as symbols. 
Here are some data for concentration (c) versus time (t) for
the photodegradation of aqueous bromine: 


These data can be described by the following function: 


$$c = 4.84e^{-0.034t}$$


Use MATLAB to create a plot displaying both the data 
(using diamond-shaped, filled-red symbols) and the func- 
tion (using a green, dashed line). Plot the function for t = 
0 to 70 min. 

2.17 The semilogy function operates in an identical fashion 
to the plot function except that a logarithmic (base-10) scale 
is used for the y axis. Use this function to plot the data and 
function as described in Prob. 2.16. Explain the results. 
2.18 Here are some wind tunnel data for force (F) versus
velocity (v):

v, m/s   10   20   30    40    50    60     70    80
F, N     25   70   380   550   610   1220   830   1450

These data can be described by the following function:

$$F = 0.2741v^{1.9842}$$

Use MATLAB to create a plot displaying both the data (using
circular magenta symbols) and the function (using a black
dash-dotted line). Plot the function for v = 0 to 100 m/s and
label the plot’s axes.

2.19 The loglog function operates in an identical fash- 
ion to the plot function except that logarithmic scales are 
used for both the x and y axes. Use this function to plot 
the data and function as described in Prob. 2.18. Explain 
the results. 

2.20 The Maclaurin series expansion for the cosine is

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots$$

Use MATLAB to create a plot of the cosine (solid line)
along with a plot of the series expansion (black dashed line)
up to and including the term $x^8/8!$. Use the built-in function
factorial in computing the series expansion. Make the
range of the abscissa from x = 0 to 3π/2.

2.21 You contact the jumpers used to generate the data 
in Table 2.1 and measure their frontal areas. The result- 
ing values, which are ordered in the same sequence as the 
corresponding values in Table 2.1, are 


A, m²   0.455   0.402   0.452   0.486   0.531   0.475   0.487

(a) If the air density is $\rho$ = 1.223 kg/m³, use MATLAB to
compute values of the dimensionless drag coefficient $C_D$.

(b) Determine the average, minimum, and maximum of the
resulting values.

(c) Develop a stacked plot of A versus m (upper) and $C_D$
versus m (lower). Include descriptive axis labels and
titles on the plots.

2.22 The following parametric equations generate a conical
helix.

x = t cos(6t)
y = t sin(6t)
z = t

Compute values of x, y, and z for t = 0 to 6π with Δt = π/64.
Use subplot to generate a two-dimensional line plot (red
solid line) of (x, y) in the top pane and a three-dimensional
line plot (cyan solid line) of (x, y, z) in the bottom pane.
Label the axes for both plots.

2.23 Exactly what will be displayed after the following
MATLAB commands are typed?

(a) >> x = 5;
    >> x ^ 3;
    >> y = 8 - x

(b) >> q = 4:2:12;
    >> r = [7 8 4; 3 6 -5];
    >> sum(q) * r(2, 3)
2.24 The trajectory of an object can be modeled as

$$y = (\tan\theta_0)\,x - \frac{g}{2v_0^2\cos^2\theta_0}\,x^2 + y_0$$

where y = height (m), $\theta_0$ = initial angle (radians), x =
horizontal distance (m), g = gravitational acceleration
(= 9.81 m/s²), $v_0$ = initial velocity (m/s), and $y_0$ = initial
height. Use MATLAB to find the trajectories for $y_0$ = 0 and
$v_0$ = 28 m/s for initial angles ranging from 15 to 75° in increments
of 15°. Employ a range of horizontal distances from
x = 0 to 80 m in increments of 5 m. The results should be assembled
in an array where the first dimension (rows) corresponds
to the distances, and the second dimension (columns)
corresponds to the different initial angles. Use this matrix
to generate a single plot of the heights versus horizontal
distances for each of the initial angles. Employ a legend to
distinguish among the different cases, and scale the plot so
that the minimum height is zero using the axis command.
2.25 The temperature dependence of chemical reactions can
be computed with the Arrhenius equation:

$$k = Ae^{-E/(RT_a)}$$

where k = reaction rate (s⁻¹), A = the preexponential (or frequency)
factor, E = activation energy (J/mol), R = gas constant
[8.314 J/(mol · K)], and $T_a$ = absolute temperature
(K). A compound has E = 1 × 10⁵ J/mol and A = 7 × 10¹⁶.
Use MATLAB to generate values of reaction rates for
temperatures ranging from 253 to 325 K. Use subplot to generate
a side-by-side graph of (a) k versus $T_a$ (green line) and
(b) $\log_{10} k$ (red line) versus $1/T_a$. Employ the semilogy function
to create (b). Include axis labels and titles for both subplots.
Interpret your results.

2.26 Figure P2.26a shows a uniform beam subject to a linearly
increasing distributed load. As depicted in Fig. P2.26b,
deflection y (m) can be computed with

$$y = \frac{w_0}{120EIL}\left(-x^5 + 2L^2x^3 - L^4x\right)$$

where E = the modulus of elasticity and I = the moment of
inertia (m⁴). Employ this equation and calculus to generate
MATLAB plots of the following quantities versus distance
along the beam:

(a) displacement (y),

(b) slope [$\theta(x) = dy/dx$],

(c) moment [$M(x) = EI\,d^2y/dx^2$],

(d) shear [$V(x) = EI\,d^3y/dx^3$], and

(e) loading [$w(x) = -EI\,d^4y/dx^4$].

FIGURE P2.26

Use the following parameters for your computation: L =
600 cm, E = 50,000 kN/cm², I = 30,000 cm⁴, $w_0$ = 2.5 kN/cm,
and Δx = 10 cm. Employ the subplot function to display all
the plots vertically on the same page in the order (a) to (e).
Include labels and use consistent MKS units when developing
the plots.

2.27 The butterfly curve is given by the following parametric
equations:

$$x = \sin(t)\left(e^{\cos t} - 2\cos 4t - \sin^5\frac{t}{12}\right)$$

$$y = \cos(t)\left(e^{\cos t} - 2\cos 4t - \sin^5\frac{t}{12}\right)$$

Generate values of x and y for values of t from 0 to 100 with
Δt = 1/16. Construct plots of (a) x and y versus t and (b)
y versus x. Use subplot to stack these plots vertically and
make the plot in (b) square. Include titles and axis labels on
both plots and a legend for (a). For (a), employ a dotted line
for y in order to distinguish it from x.

2.28 The butterfly curve from Prob. 2.27 can also be repre- 
sented in polar coordinates as 


Generate values of r for values of θ from 0 to 8π with
Δθ = π/32. Use the MATLAB function polar to generate
the polar plot of the butterfly curve with a dashed red line.
Employ the MATLAB Help to understand how to generate
the plot.


Programming with MATLAB 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to learn how to write M-file programs to 
implement numerical methods. Specific objectives and topics covered are 


Learning how to create well-documented M-files in the edit window and invoke 
them from the command window. 

Understanding how script and function files differ. 

Understanding how to incorporate help comments in functions. 

Knowing how to set up M-files so that they interactively prompt users for 
information and display results in the command window. 

Understanding the role of subfunctions and how they are accessed. 

Knowing how to create and retrieve data files. 

Learning how to write clear and well-documented M-files by employing 
structured programming constructs to implement logic and repetition. 
Recognizing the difference between if...elseif and switch constructs. 
Recognizing the difference between for...end and while structures. 

Knowing how to animate MATLAB plots. 

Understanding what is meant by vectorization and why it is beneficial. 
Understanding how anonymous functions can be employed to pass functions to 
function function M-files. 


YOU’VE GOT A PROBLEM 


In Chap. 1, we used a force balance to develop a mathematical model to predict the
fall velocity of a bungee jumper. This model took the form of the following differential
equation:

$$\frac{dv}{dt} = g - \frac{c_d}{m}v^2$$

We also learned that a numerical solution of this equation could be obtained with Euler’s
method:

$$v_{i+1} = v_i + \frac{dv_i}{dt}\,\Delta t$$

This equation can be implemented repeatedly to compute velocity as a function of 
time. However, to obtain good accuracy, many small steps must be taken. This would be 
extremely laborious and time consuming to implement by hand. However, with the aid of 
MATLAB, such calculations can be performed easily. 

So our problem now is to figure out how to do this. This chapter will introduce you to
how MATLAB M-files can be used to obtain such solutions.
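
As a preview of where this chapter is heading, the following sketch applies Euler’s method to the bungee jumper in a short script. It assumes the second-order drag model dv/dt = g − (cd/m)v² from Chap. 1 and borrows the parameter values used in this chapter’s examples; the step size dt = 0.01 s and the variable names are illustrative choices, and the for loop it relies on is a construct covered later in the chapter.

```matlab
% Sketch: Euler's method for the bungee jumper's fall velocity.
% dt and the variable names are assumptions made for illustration.
g = 9.81; m = 68.1; cd = 0.25;   % gravity (m/s^2), mass (kg), drag (kg/m)
dt = 0.01; tf = 12;              % assumed step size and final time (s)
n = round(tf/dt);                % number of Euler steps
v = 0;                           % initial velocity (m/s)
for i = 1:n
  dvdt = g - (cd/m)*v^2;         % slope from the differential equation
  v = v + dvdt*dt;               % Euler update: v(i+1) = v(i) + slope*dt
end
vexact = sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*tf);   % analytical solution, Eq. (1.9)
fprintf('Euler: %8.4f m/s   analytical: %8.4f m/s\n', v, vexact)
```

Reducing dt should bring the Euler estimate closer to the analytical value, at the cost of more steps.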


3.1 M-FILES


The most common way to operate MATLAB is by entering commands one at a time in 
the command window. M-files provide an alternative way of performing operations that 
greatly expand MATLAB’s problem-solving capabilities. An M-file consists of a series of 
statements that can be run all at once. Note that the nomenclature “M-file” comes from the 
fact that such files are stored with a .m extension. M-files come in two flavors: script files 
and function files. 


3.1.1 Script Files 


A script file is merely a series of MATLAB commands that are saved on a file. They are 
useful for retaining a series of commands that you want to execute on more than one oc- 
casion. The script can be executed by typing the file name in the command window or by 
pressing the Run button. 


EXAMPLE 3.1 Script File

Problem Statement. Develop a script file to compute the velocity of the free-falling bungee
jumper for the case where the initial velocity is zero.

Solution. Open the editor with the selection: New, Script. Type in the following statements
to compute the velocity of the free-falling bungee jumper at a specific time [recall
Eq. (1.9)]:

g = 9.81; m = 68.1; t = 12; cd = 0.25;
v = sqrt(g * m / cd) * tanh(sqrt(g * cd / m) * t)

Save the file as scriptdemo.m. Return to the command window and type

>> scriptdemo

The result will be displayed as

v =
   50.6175

Thus, the script executes just as if you had typed each of its lines in the command window.
As a final step, determine the value of g by typing

>> g

g =
    9.8100


So you can see that even though g was defined within the script, it retains its value back 
in the command workspace. As we will see in the following section, this is an important 
distinction between scripts and functions. 


3.1.2 Function Files

Function files are M-files that start with the word function. In contrast to script files, they
can accept input arguments and return outputs. Hence they are analogous to user-defined
functions in programming languages such as Fortran, Visual Basic, or C.

The syntax for the function file can be represented generally as

function outvar = funcname(arglist)
% helpcomments
statements
outvar = value;

where outvar = the name of the output variable, funcname = the function’s name, arglist = the
function’s argument list (i.e., comma-delimited values that are passed into the function),
helpcomments = text that provides the user with information regarding the function (these can
be invoked by typing help funcname in the command window), and statements = MATLAB
statements that compute the value that is assigned to outvar.

Beyond its role in describing the function, the first line of the helpcomments, called the
H1 line, is the line that is searched by the lookfor command (recall Sec. 2.6). Thus, you
should include key descriptive words related to the file on this line.

The M-file should be saved as funcname.m. The function can then be run by typing
funcname in the command window as illustrated in the following example. Note that even
though MATLAB is case-sensitive, your computer’s operating system may not be. Whereas
MATLAB would treat function names like freefall and FreeFall as two different functions,
your operating system might not.


EXAMPLE 3.2 Function File

Problem Statement. As in Example 3.1, compute the velocity of the free-falling bungee
jumper but now use a function file for the task.

Solution. Type the following statements in the file editor:


function v = freefall(t, m, cd)
% freefall: bungee velocity with second-order drag
% v=freefall(t,m,cd) computes the free-fall velocity
%                    of an object with second-order drag
% input:
%   t = time (s)
%   m = mass (kg)
%   cd = second-order drag coefficient (kg/m)
% output:
%   v = downward velocity (m/s)
g = 9.81; % acceleration of gravity
v = sqrt(g * m / cd)*tanh(sqrt(g * cd / m) * t);


Save the file as freefall.m. To invoke the function, return to the command window and 
type in 
>> freefall(12,68.1,0.25) 


The result will be displayed as 


ans = 
50.6175 


One advantage of a function M-file is that it can be invoked repeatedly for different 
argument values. Suppose that you wanted to compute the velocity of a 100-kg jumper 
after 8 s: 


>> freefall(8,100,0.25) 


ans = 
53.1878 


To invoke the help comments type 
>> help freefall 
which results in the comments being displayed 


freefall: bungee velocity with second-order drag
v=freefall(t,m,cd) computes the free-fall velocity
                   of an object with second-order drag
input:
  t = time (s)
  m = mass (kg)
  cd = second-order drag coefficient (kg/m)
output:
  v = downward velocity (m/s)


If at a later date, you forgot the name of this function, but remembered that it involved 
bungee jumping, you could enter 


>> lookfor bungee 


and the following information would be displayed 


freefall.m - bungee velocity with second-order drag 


Note that, at the end of the previous example, if we had typed

>> g

the following message would have been displayed

??? Undefined function or variable 'g'.


So even though g had a value of 9.81 within the M-file, it would not have a value in the 
command workspace. As noted previously at the end of Example 3.1, this is an important 
distinction between functions and scripts. The variables within a function are said to be 
local and are erased after the function is executed. In contrast, the variables in a script 
retain their existence after the script is executed. 

Function M-files can return more than one result. In such cases, the variables 
containing the results are comma-delimited and enclosed in brackets. For example, 
the following function, stats.m, computes the mean and the standard deviation of 
a vector: 


function [mean, stdev] = stats(x)
n = length(x);
mean = sum(x)/n;
stdev = sqrt(sum((x-mean).^2/(n-1)));


Here is an example of how it can be applied: 


>> y = [8 5 10 12 6 7.5 4]; 
>> [m,s] = stats(y) 


m= 
7.5000 


s = 
2.8137 
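
The same bracketed syntax is used when calling built-in functions that return more than one result. As a small illustration (the variable names ymax, imax, and ybig are arbitrary), max can return both the largest element of a vector and its position, whereas requesting a single output returns only the first result.

```matlab
% Sketch: requesting one or two outputs from the built-in max function.
y = [8 5 10 12 6 7.5 4];
[ymax, imax] = max(y);   % two outputs: largest value and its index
ybig = max(y);           % one output: only the largest value is returned
fprintf('max = %g at position %d\n', ymax, imax)
```

The stats function above behaves the same way: calling it with a single output variable would return only the mean.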


Although we will also make use of script M-files, function M-files will be our primary 
programming tool for the remainder of this book. Hence, we will often refer to function 
M-files as simply M-files. 


3.1.3 Variable Scope 


MATLAB variables have a property known as scope that refers to the context of the com- 
puting environment in which the variable has a unique identity and value. Typically, a 
variable’s scope is limited either to the MATLAB workspace or within a function. This 
principle prevents errors when a programmer unintentionally gives the same name to vari- 
ables in different contexts. 

Any variables defined through the command line are within the MATLAB workspace 
and you can readily inspect a workspace variable’s value by entering its name at the com- 
mand line. However, workspace variables are not directly accessible to functions but rather 
are passed to functions via their arguments. For example, here is a function that adds two 
numbers 


function c = adder(a,b)
x = 88
a
c = a + b




Suppose in the command window we type

>> x = 1; y = 4; c = 8

c =
     8

So as expected, the value of c in the workspace is 8. If you type

>> d = adder(x,y)

the result will be

x =
    88
a =
     1
c =
     5
d =
     5

But, if you then type

>> c, x, a

The result is

c =
     8
x =
     1
Undefined function or variable 'a'.
Error in ScopeScript (line 6)
c, x, a


The point here is that even though x was assigned a new value inside the function, the 
variable of the same name in the MATLAB workspace is unchanged. Even though they 
have the same name, the scope of each is limited to their context and does not overlap. In 
the function, the variables a and b are limited in scope to that function, and only exist while 
that function is being executed. Such variables are formally called local variables. Thus, 
when we try to display the value of a in the workspace, an error message is generated
because the workspace has no access to the a in the function.

Another obvious consequence of limited-scope variables is that any parameter needed 
by a function must be passed as an input argument or by some other explicit means. A func- 
tion cannot otherwise access variables in the workspace or in other functions. 
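
One explicit means besides the argument list, sketched here with illustrative names, is an anonymous function, which copies the values of any workspace variables it uses at the moment it is created. Anonymous functions are taken up later in this chapter; the snippet is only meant to suggest that the copy held by the function is independent of later changes in the workspace.

```matlab
% Sketch: an anonymous function captures workspace values when it is created.
g = 9.81; m = 68.1; cd = 0.25;
vel = @(t) sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t);   % g, m, and cd are captured here
g = 0;             % changing the workspace variable afterward ...
v12 = vel(12);     % ... does not alter the value stored inside vel
fprintf('v(12) = %8.4f m/s\n', v12)
```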


3.1.4 Global Variables 


As we have just illustrated, the function’s argument list is like a window through which
information is selectively passed between the workspace and a function, or between
two functions. Sometimes, however, it might be convenient to have access to a variable
in several contexts without passing it as an argument. In such cases, this can be
accomplished by defining the variable as global. This is done with the global command,
which is defined as


global X Y Z 


where X, Y, and Z are global in scope. If several functions (and possibly the workspace) all
declare a particular name as global, then they all share a single value of that variable. Any 
change to that variable, in any function, is then made to all the other functions that declare 
it global. Stylistically, MATLAB recommends that global variables use all capital letters, 
but this is not required. 


EXAMPLE 3.3 Use of Global Variables

Problem Statement. The Stefan-Boltzmann law is used to compute the radiation flux
from a black body¹ as in

$$J = \sigma T_a^4$$

where $J$ = radiation flux [W/(m² s)], $\sigma$ = the Stefan-Boltzmann constant (5.670367 ×
10⁻⁸ W m⁻² K⁻⁴), and $T_a$ = absolute temperature (K). In assessing the impact of climate
change on water temperature, it is used to compute the radiation terms in a waterbody’s
heat balance. For example, the long-wave radiation from the atmosphere to the waterbody,
$J_{an}$ [W/(m² s)], can be calculated as

$$J_{an} = 0.97\sigma(T_{air} + 273.15)^4\left(0.6 + 0.031\sqrt{e_{air}}\right)$$

where $T_{air}$ = the temperature of the air above the waterbody (°C), and $e_{air}$ = the vapor pressure
of the air above the water body (mmHg),

$$e_{air} = 4.596\,e^{\frac{17.27T_d}{237.3 + T_d}} \tag{E3.3.1}$$

where $T_d$ = the dew-point temperature (°C). The radiation from the water surface back into
the atmosphere, $J_{br}$ [W/(m² s)], is calculated as

$$J_{br} = 0.97\sigma(T_w + 273.15)^4 \tag{E3.3.2}$$

where $T_w$ = the water temperature (°C). Write a script that utilizes two functions to compute
the net long-wave radiation (i.e., the difference between the atmospheric radiation in
and the water back radiation out) for a cold lake with a surface temperature of $T_w$ = 15 °C
on a hot ($T_{air}$ = 30 °C), humid ($T_d$ = 27.7 °C) summer day. Use global to share the
Stefan-Boltzmann constant between the script and functions.


Solution. Here is the script

clc, format compact
global SIGMA
SIGMA = 5.670367e-8;
Tair = 30; Tw = 15; Td = 27.7;
Jan = AtmLongWaveRad(Tair, Td)
Jbr = WaterBackRad(Tw)
JnetLongWave = Jan-Jbr

¹ A black body is an object that absorbs all incident electromagnetic radiation, regardless of frequency or angle
of incidence. A white body is one that reflects all incident rays completely and uniformly in all directions.


Here is a function to compute the incoming long wave radiation from the atmosphere into 
the lake 


function Ja = AtmLongWaveRad(Tair, Td)
global SIGMA
eair = 4.596*exp(17.27*Td/(237.3+Td));
Ja = 0.97*SIGMA*(Tair+273.15)^4*(0.6+0.031*sqrt(eair));
end


and here is a function to compute the long wave radiation from the lake back into the 
atmosphere 


function Jb = WaterBackRad(Twater)
global SIGMA
Jb = 0.97*SIGMA*(Twater+273.15)^4;
end


When the script is run, the output is 


Jan =
  354.8483
Jbr =
  379.1905
JnetLongWave =
  -24.3421


Thus, for this case, because the back radiation is larger than the incoming radiation, the
lake loses heat at the rate of 24.3421 W/(m² s) due to the two long-wave radiation fluxes.


If you require additional information about global variables, you can always type help 
global at the command prompt. The help facility can also be invoked to learn about other 
MATLAB commands dealing with scope such as persistent. 


3.1.5 Subfunctions 


Functions can call other functions. Although such functions can exist as separate M-files, 
they may also be contained in a single M-file. For example, the M-file in Example 3.2 
(without comments) could have been split into two functions and saved as a single M-file²:


function v = freefallsubfunc(t, m, cd)
v = vel(t, m, cd);
end

function v = vel(t, m, cd)
g = 9.81;
v = sqrt(g * m / cd)*tanh(sqrt(g * cd / m) * t);
end


² Note that although end statements are not used to terminate single-function M-files, they are included when
subfunctions are involved to demarcate the boundaries between the main function and the subfunctions.

This M-file would be saved as freefallsubfunc.m. In such cases, the first function is called
the main or primary function. It is the only function that is accessible to the command window
and other functions and scripts. All the other functions (in this case, vel) are referred
to as subfunctions.

A subfunction is only accessible to the main function and other subfunctions within
the M-file in which it resides. If we run freefallsubfunc from the command window, the
result is identical to Example 3.2:

>> freefallsubfunc(12,68.1,0.25)

ans =
   50.6175

However, if we attempt to run the subfunction vel, an error message occurs:

>> vel(12,68.1,.25)
??? Undefined function or method 'vel' for input arguments of type 'double'.


3.2 INPUT-OUTPUT


As in Sec. 3.1, information is passed into the function via the argument list and is output via
the function’s name. Two other functions provide ways to enter and display information
directly using the command window.


The input Function. This function allows you to prompt the user for values directly from 
the command window. Its syntax is 


n = input('promptstring')


The function displays the promptstring, waits for keyboard input, and then returns the value 
from the keyboard. For example, 


m = input('Mass (kg): ') 
When this line is executed, the user is prompted with the message 
Mass (kg): 


If the user enters a value, it would then be assigned to the variable m. 
The input function can also return user input as a string. To do this, an 's' is appended 
to the function’s argument list. For example, 


name = input('Enter your name: ','s') 


The disp Function. This function provides a handy way to display a value. Its syntax is

disp(value)

where value = the value you would like to display. It can be a numeric constant or variable,
or a string message enclosed in single quotes. Its application is illustrated in the following
example.


EXAMPLE 3.4 An Interactive M-File Function


Problem Statement. As in Example 3.2, compute the velocity of the free-falling bungee 
jumper, but now use the input and disp functions for input/output. 


Solution. Type the following statements in the file editor: 


function freefalli
% freefalli: interactive bungee velocity
%   freefalli interactive computation of the
%             free-fall velocity of an object
%             with second-order drag.
g = 9.81; % acceleration of gravity
m = input('Mass (kg): ');
cd = input('Drag coefficient (kg/m): ');
t = input('Time (s): ');
disp(' ')
disp('Velocity (m/s):')
disp(sqrt(g * m / cd)*tanh(sqrt(g * cd / m) * t))


Save the file as freefalli.m. To invoke the function, return to the command window 
and type 
>> freefalli 


Mass (kg): 68.1 
Drag coefficient (kg/m): 0.25 
Time (s): 12 


Velocity (m/s): 
50.6175 


The fprintf Function. This function provides additional control over the display of 
information. A simple representation of its syntax is 


fprintf('format', x, ...) 


where format is a string specifying how you want the value of the variable x to be displayed. 
The operation of this function is best illustrated by examples. 

A simple example would be to display a value along with a message. For instance, 
suppose that the variable velocity has a value of 50.6175. To display the value using eight 
digits with four digits to the right of the decimal point along with a message, the statement 
along with the resulting output would be 


>> fprintf('The velocity is %8.4f m/s\n', velocity) 
The velocity is 50.6175 m/s 


This example should make it clear how the format string works. MATLAB starts at the
left end of the string and displays the labels until it detects one of the symbols: % or \. In our
example, it first encounters a % and recognizes that the following text is a format code. As
in Table 3.1, the format codes allow you to specify whether numeric values are displayed
in integer, decimal, or scientific format. After displaying the value of velocity, MATLAB
continues displaying the character information (in our case the units: m/s) until it detects the
symbol \. This tells MATLAB that the following text is a control code. As in Table 3.1, the
control codes provide a means to perform actions such as skipping to the next line. If we
had omitted the code \n in the previous example, the command prompt would appear at the
end of the label m/s rather than on the next line as would typically be desired.

TABLE 3.1 Commonly used format and control codes employed
with the fprintf function.

Format Code     Description
%d              Integer format
%e              Scientific format with lowercase e
%E              Scientific format with uppercase E
%f              Decimal format
%g              The more compact of %e or %f

Control Code    Description
\n              Start new line
\t              Tab

The fprintf function can also be used to display several values per line with different 
formats. For example, 


>> fprintf('%5d %10.3f %8.5e\n',100,2*pi,pi); 
100 6.283 3.14159e+000 


It can also be used to display vectors and matrices. Here is an M-file that enters two 
sets of values as vectors. These vectors are then combined into a matrix, which is then 
displayed as a table with headings: 


function fprintfdemo
x = [1 2 3 4 5];
y = [20.4 12.6 17.8 88.7 120.4];
z = [x;y];
fprintf('    x          y\n');
fprintf('%5d %10.3f\n',z);

The result of running this M-file is

>> fprintfdemo
    x          y
    1     20.400
    2     12.600
    3     17.800
    4     88.700
    5    120.400
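
The table comes out right because the format string is reused until every element of the matrix has been consumed, and the elements are taken column by column. The following sketch is an illustration rather than part of the original example. It uses sprintf, which accepts the same format strings as fprintf but returns the text instead of displaying it, and the matrix values are made up for the demonstration:

```matlab
% Illustrative sketch: the format is recycled, and z is read column by column.
z = [1 2 3; 10.5 20.25 30.5];   % row 1 = x values, row 2 = y values (made-up data)
s = sprintf('%d:%.2f;', z);     % same format codes as fprintf, but returns a string
disp(s)
```

Because each column of z holds one (x, y) pair, the pairs appear in order. Had the matrix been built with x and y as columns instead of rows, the format would have been applied down the first column before reaching the second.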


3.2.1 Creating and Accessing Files 


MATLAB has the capability to both read and write data files. The simplest approach in- 
volves a special type of binary file, called a MAT-file, which is expressly designed for 
implementation within MATLAB. Such files are created and accessed with the save and 
load commands. 




The save command can be used to generate a MAT-file holding either the entire work- 
space or a few selected variables. A simple representation of its syntax is 


save filename var1 var2 ... varn

This command creates a MAT-file named filename.mat that holds the variables var1 through
varn. If the variables are omitted, all the workspace variables are saved. The load command
can subsequently be used to retrieve the file:

load filename var1 var2 ... varn

which retrieves the variables var1 through varn from filename.mat. As was the case with
save, if the variables are omitted, all the variables are retrieved.


For example, suppose that you use Eq. (1.9) to generate velocities for a set of drag 
coefficients: 


>> g=9.81;m=80;t=5; 
>> cd=[.25 .267 .245 .28 .273]'; 
>> v=sqrt(g*m ./cd).*tanh(sqrt(g*cd/m)*t) ; 
You can then create a file holding the values of the drag coefficients and the velocities with 


>> save veldrag v cd 


To illustrate how the values can be retrieved at a later time, remove all variables from 
the workspace with the clear command, 


>> clear 
At this point, if you tried to display the velocities you would get the result:

>> v
??? Undefined function or variable 'v'. 


However, you can recover them by entering 
>> load veldrag 
Now, the velocities are available as can be verified by typing 


>> who 


Your variables are: 
cd v 
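
The save and load commands can also be written in function form, which is convenient inside M-files where the file name is held in a variable. The sketch below is a hedged illustration, not part of the session above: the temporary file name and the data values are assumptions chosen for the demonstration, and load is called with an output argument so that the variables come back as fields of a structure:

```matlab
% Sketch: save two variables to a temporary MAT-file and load them into a struct.
fname = [tempname '.mat'];       % temporary file name (illustrative choice)
v = [46.1 45.2 46.4];            % made-up values for the demonstration
cd = [0.25 0.267 0.245];
save(fname, 'v', 'cd');          % function form of: save filename v cd
clear v cd                       % remove the originals from the workspace
S = load(fname);                 % returns a struct with fields v and cd
delete(fname);                   % remove the temporary file
disp(S.cd)
```

Loading into a structure avoids overwriting workspace variables that happen to share a name with something stored in the file.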


Although MAT-files are quite useful when working exclusively within the MATLAB 
environment, a somewhat different approach is required when interfacing MATLAB with 
other programs. In such cases, a simple approach is to create text files written in ASCII 
format. 

ASCII files can be generated in MATLAB by appending -ascii to the save command. 
In contrast to MAT-files where you might want to save the entire workspace, you would 
typically save a single rectangular matrix of values. For example,

>> A=[5 7 9 2;3 6 3 9];
>> save simpmatrix.txt -ascii


In this case, the save command stores the values in A in 8-digit ASCII form. If you want 
to store the numbers in double precision, just append -ascii -double. In either case, the file 
can be accessed by other programs such as spreadsheets or word processors. For example, 
if you open this file with a text editor, you will see 


5.0000000e+000 7.0000000e+000 9.0000000e+000 2.0000000e+000
3.0000000e+000 6.0000000e+000 3.0000000e+000 9.0000000e+000


Alternatively, you can read the values back into MATLAB with the load command, 
>> load simpmatrix.txt 


Because simpmatrix.txt is not a MAT-file, MATLAB creates a double precision array 
named after the filename: 


>> simpmatrix 
simpmatrix = 
5 7 9 2 
3 6 3 9 
Alternatively, you could use the load command as a function and assign its values to a 
variable as in 


>> A = load('simpmatrix.txt') 


The foregoing material covers but a small portion of MATLAB’s file management 
capabilities. For example, a handy import wizard can be invoked with the menu selections: 
File, Import Data. As an exercise, you can demonstrate the import wizard's convenience
by using it to open simpmatrix.txt. In addition, you can always consult help to learn more 
about this and other features. 


3.3 STRUCTURED PROGRAMMING


The simplest of all M-files perform instructions sequentially. That is, the program state- 
ments are executed line by line starting at the top of the function and moving down to the 
end. Because a strict sequence is highly limiting, all computer languages include state- 
ments allowing programs to take nonsequential paths. These can be classified as 


• Decisions (or Selection). The branching of flow based on a decision.
• Loops (or Repetition). The looping of flow to allow statements to be repeated.


3.3.1 Decisions 


The if Structure. This structure allows you to execute a set of statements if a logical 
condition is true. Its general syntax is 


if condition 
statements 
end 


where condition is a logical expression that is either true or false. For example, here is a 
simple M-file to evaluate whether a grade is passing: 


function grader(grade)
% grader(grade):
%   determines whether grade is passing
% input:
%   grade = numerical value of grade (0-100)
% output:
%   displayed message
if grade >= 60
  disp('passing grade')
end


The following illustrates the result 
>> grader (95.6) 
passing grade 


For cases where only one statement is executed, it is often convenient to implement the 
if structure as a single line, 
if grade >= 60, disp('passing grade'), end


This structure is called a single-line if. For cases where more than one statement is imple- 
mented, the multiline if structure is usually preferable because it is easier to read. 


Error Function. A nice example of the utility of a single-line if is to employ it for rudi- 
mentary error trapping. This involves using the error function which has the syntax, 


error(msg) 


When this function is encountered, it displays the text message msg, indicates where the 
error occurred, and causes the M-file to terminate and return to the command window. 

An example of its use would be where we might want to terminate an M-file to avoid 
a division by zero. The following M-file illustrates how this could be done: 


function f = errortest(x)
if x == 0, error('zero value encountered'), end
f = 1/x;


If a nonzero argument is used, the division would be implemented successfully as in 


>> errortest(10) 
ans = 
0.1000 
However, for a zero argument, the function would terminate prior to the division and the 
error message would be displayed in red typeface: 


>> errortest(0) 


??? Error using ==> errortest at 2 
zero value encountered 
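
When terminating the program is not what you want, the message raised by error can be intercepted instead. The sketch below is not part of the example above. It assumes MATLAB's try...catch construct, which is not otherwise covered in this section, and the variable names and the fallback value are illustrative:

```matlab
% Sketch: trap the error raised for a zero argument instead of halting.
x = 0;
try
    if x == 0, error('zero value encountered'), end
    f = 1/x;
catch err                        % err holds information about the error
    f = NaN;                     % illustrative fallback value
    msg = err.message;           % the text that was passed to error
    disp(['Caught: ' msg])
end
```

The statements after the catch line run only if an error occurs in the try block, so the division is still skipped for a zero argument.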


Logical Conditions. The simplest form of the condition is a single relational expression 
that compares two values as in 


value1 relation value2


where the values can be constants, variables, or expressions and the relation is one of the 
relational operators listed in Table 3.2.

TABLE 3.2 Summary of relational operators in MATLAB.

Example       Operator  Relationship
x == 0        ==        Equal
unit ~= 'm'   ~=        Not equal
a < 0         <         Less than
s > t         >         Greater than
3.9 <= a/3    <=        Less than or equal to
r >= 0        >=        Greater than or equal to


MATLAB also allows testing of more than one logical condition by employing logical 
operators. We will emphasize the following: 


• ~ (Not). Used to perform logical negation on an expression.

~expression

If the expression is true, the result is false. Conversely, if the expression is false, the
result is true.
• & (And). Used to perform a logical conjunction on two expressions.

expression1 & expression2

If both expressions evaluate to true, the result is true. If either or both expressions
evaluate to false, the result is false.
• || (Or). Used to perform a logical disjunction on two expressions.

expression1 || expression2

If either or both expressions evaluate to true, the result is true.


Table 3.3 summarizes all possible outcomes for each of these operators. Just as for 
arithmetic operations, there is a priority order for evaluating logical operations. These 
are from highest to lowest: ~, & and ||. In choosing between operators of equal priority, 
MATLAB evaluates them from left to right. Finally, as with arithmetic operators, paren- 
theses can be used to override the priority order. 

TABLE 3.3 A truth table summarizing the possible outcomes for logical operators
employed in MATLAB. The order of priority of the operators is shown at
the top of the table.

              Highest  ------------>  Lowest
x     y       ~x       x & y          x || y
T     T       F        T              T
T     F       F        F              T
F     T       T        F              T
F     F       T        F              F

Let’s investigate how the computer employs the priorities to evaluate a logical expression.
If a = -1, b = 2, x = 1, and y = 'b', evaluate whether the following is true or false:

a * b > 0 & b == 2 & x > 7 || ~(y > 'd')

To make it easier to evaluate, substitute the values for the variables:

-1 * 2 > 0 & 2 == 2 & 1 > 7 || ~('b' > 'd')

The first thing that MATLAB does is to evaluate any mathematical expressions. In this
example, there is only one: -1 * 2,

-2 > 0 & 2 == 2 & 1 > 7 || ~('b' > 'd')

Next, evaluate all the relational expressions

-2 > 0 & 2 == 2 & 1 > 7 || ~('b' > 'd')
   F   &    T   &   F   || ~     F

At this point, the logical operators are evaluated in priority order. Since the ~ has highest
priority, the last expression (~F) is evaluated first to give

F & T & F || T

The & operator is evaluated next. Since there are two, the left-to-right rule is applied and the
first expression (F & T) is evaluated:

F & F || T

The & again has highest priority

F || T

Finally, the || is evaluated as true. The entire process is depicted in Fig. 3.1.
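
The hand evaluation can be checked in MATLAB itself. The following sketch is a verification aid rather than part of the original discussion. The intermediate names rel1 through rel4 are introduced here only for illustration, and the parentheses in byhand spell out the priority order explicitly:

```matlab
% Sketch: reproduce the hand evaluation step by step.
a = -1; b = 2; x = 1; y = 'b';
rel1 = a * b > 0;                % -2 > 0     -> false
rel2 = b == 2;                   %  2 == 2    -> true
rel3 = x > 7;                    %  1 > 7     -> false
rel4 = y > 'd';                  % 'b' > 'd'  -> false (character codes are compared)
byhand = ((rel1 & rel2) & rel3) || ~rel4;   % priorities made explicit
direct = a * b > 0 & b == 2 & x > 7 || ~(y > 'd');
disp([byhand direct])
```

Both results are true (displayed as 1), which agrees with the step-by-step evaluation.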


FIGURE 3.1
A step-by-step evaluation of a complex decision.

The if...else Structure. This structure allows you to execute a set of statements if
a logical condition is true and to execute a second set if the condition is false. Its general
syntax is

if condition
  statements1
else
  statements2
end

The if...elseif Structure. It often happens that the false option of an if...else structure
is another decision. This type of structure often occurs when we have more than two options
for a particular problem setting. For such cases, a special form of decision structure,
the if...elseif has been developed. It has the general syntax

if condition1
  statements1
elseif condition2
  statements2
elseif condition3
  statements3
  .
  .
  .
else
  statementselse
end

EXAMPLE 3.5 if Structures


Problem Statement. For a scalar, the built-in MATLAB sign function returns the sign of
its argument (-1, 0, 1). Here’s a MATLAB session that illustrates how it works:


>> sign(25.6) 


ans = 
1 


>> sign(-0.776) 
ans = 
-1 
>> sign(0) 
ans = 
0 


Develop an M-file to perform the same function. 


Solution. First, an if structure can be used to return 1 if the argument is positive: 


function sgn = mysign(x) 
% mysign(x) returns 1 if x is greater than zero. 
if x>0 
sgn = 1; 
end 




This function can be run as 
>> mysign(25.6) 


ans = 
1 


Although the function handles positive numbers correctly, if it is run with a negative 
or zero argument, nothing is displayed. To partially remedy this shortcoming, an if...else 
structure can be used to display -1 if the condition is false: 


function sgn = mysign(x) 
% mysign(x) returns 1 if x is greater than zero. 
% -1 if x is less than or equal to zero. 
if x>0 
sgn = 1; 
else 
sgn = -1; 
end 


This function can be run as 
>> mysign(-0.776) 


ans = 
-1


Although the positive and negative cases are now handled properly, -1 is erroneously 
returned if a zero argument is used. An if...elseif structure can be used to incorporate this 
final case: 


function sgn = mysign(x) 
% mysign(x) returns 1 if x is greater than zero. 
% -1 if x is less than zero. 
% 0 if x is equal to zero. 
if x>0 

sgn = 1; 
elseif x < 0 

sgn = -1; 
else 

sgn = 0; 
end 


The function now handles all possible cases. For example, 


>> mysign(0) 


ans = 
0 
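
The if...elseif version is written for a scalar argument. As a hedged aside, the same three-way result can be obtained for a whole array by combining relational expressions arithmetically. This is one possible alternative, not the approach developed in the example, and the test values are simply the ones used in the sessions above:

```matlab
% Sketch: true/false results act as 1/0 in arithmetic, so
% (x > 0) - (x < 0) gives 1, -1, or 0 element by element.
x = [25.6 -0.776 0];
sgn = (x > 0) - (x < 0);
disp(sgn)
```

For these inputs the result matches what the built-in sign function returns.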


The switch Structure. The switch structure is similar in spirit to the if...elseif structure. 
However, rather than testing individual conditions, the branching is based on the value of a 
single test expression. Depending on its value, different blocks of code are implemented. In 
addition, an optional block is implemented if the expression takes on none of the prescribed 
values. It has the general syntax 


switch testexpression
  case value1
    statements1
  case value2
    statements2
  .
  .
  .
  otherwise
    statementsotherwise
end


As an example, here is a code fragment that displays a message depending on the value of the
string variable, grade.
grade = 'B'; 
switch grade 
case 'A' 
disp( 'Excellent' ) 
case 'B' 
disp('Good' ) 
case 'C' 
disp( 'Mediocre' ) 
case 'D' 
disp('Whoops' ) 
case 'F' 
disp('Would like fries with your order?') 
otherwise 
disp('Huh! ') 
end 


When this code is executed, the message “Good” is displayed.
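
A single case can also match several values if they are grouped in a cell array. The sketch below is an illustrative variation on the grade example, not taken from it. The messages and the use of upper to accept lowercase input are assumptions made for the demonstration:

```matlab
% Sketch: one case handles several values listed in a cell array.
grade = 'b';
switch upper(grade)              % upper converts 'b' to 'B' before testing
    case {'A', 'B', 'C', 'D'}
        msg = 'Passing';
    case 'F'
        msg = 'Failing';
    otherwise
        msg = 'Unrecognized grade';
end
disp(msg)
```

Grouping values this way avoids repeating the same statements under four separate case lines.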


Variable Argument List. MATLAB allows a variable number of arguments to be passed 
to a function. This feature can come in handy for incorporating default values into your 
functions. A default value is a number that is automatically assigned in the event that the 
user does not pass it to a function. 

As an example, recall that earlier in this chapter, we developed a function freefall, 
which had three arguments: 


v = freefall(t,m,cd) 


Although a user would obviously need to specify the time and mass, they might not have 
a good idea of an appropriate drag coefficient. Therefore, it would be nice to have the pro- 
gram supply a value if they omitted it from the argument list. 

MATLAB has a function called nargin that provides the number of input arguments 
supplied to a function by a user. It can be used in conjunction with decision structures like 
the if or switch constructs to incorporate default values as well as error messages into your 
functions. The following code illustrates how this can be done for freefall:

function v = freefall2(t, m, cd)
% freefall2: bungee velocity with second-order drag
%   v=freefall2(t,m,cd) computes the free-fall velocity
%                       of an object with second-order drag.
% input:
%   t = time (s)
%   m = mass (kg)
%   cd = drag coefficient (default = 0.27 kg/m)
% output:
%   v = downward velocity (m/s)
switch nargin
  case 0
    error('Must enter time and mass')
  case 1
    error('Must enter mass')
  case 2
    cd = 0.27;
end
g = 9.81; % acceleration of gravity
v = sqrt(g * m / cd)*tanh(sqrt(g * cd / m) * t);


Notice how we have used a switch structure to either display error messages or set the 
default, depending on the number of arguments passed by the user. Here is a command 
window session showing the results: 


>> freefall2(12,68.1,0.25) 


ans = 
50.6175 


>> freefall2(12,68.1) 


ans = 
48.8747 


>> freefall2(12) 


??? Error using ==> freefall2 at 15 
Must enter mass 


>> freefall2() 
??? Error using ==> freefall2 at 13 
Must enter time and mass 


Note that nargin behaves a little differently when it is invoked in the command window. 
In the command window, it must include a string argument specifying the function and it 
returns the number of arguments in the function. For example,

>> nargin('freefall2')


ans = 
3 


3.3.2 Loops 


As the name implies, loops perform operations repetitively. There are two types of loops, 
depending on how the repetitions are terminated. A for loop ends after a specified number 
of repetitions. A while loop ends on the basis of a logical condition. 


The for...end Structure. A for loop repeats statements a specific number of times. Its 
general syntax is 


for index = start:step:finish 
statements 
end

The for loop operates as follows. The index is a variable that is set at an initial value, start.
The program then compares the index with a desired final value, finish. If the index is 
less than or equal to the finish, the program executes the statements. When it reaches the 
end line that marks the end of the loop, the index variable is increased by the step and the 
program loops back up to the for statement. The process continues until the index becomes 
greater than the finish value. At this point, the loop terminates as the program skips down 
to the line immediately following the end statement. 

Note that if an increment of 1 is desired (as is often the case), the step can be dropped. 
For example, 

for i = 1:5 

disp(i) 

end 
When this executes, MATLAB would display in succession, 1, 2, 3, 4, 5. In other words, 
the default step is 1. 

The size of the step can be changed from the default of 1 to any other numeric value. 
It does not have to be an integer, nor does it have to be positive. For example, step sizes of 
0.2, -1, or -5, are all acceptable. 

If a negative step is used, the loop will count down. For such cases, the
loop’s logic is reversed. Thus, the finish is less than the start and the loop terminates when 
the index is less than the finish. For example, 

for j = 10:-1:1 

disp(j) 

end 
When this executes, MATLAB would display the classic “countdown” sequence: 10, 9,
8, 7, 6, 5, 4, 3, 2, 1.
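
To see exactly which index values a loop visits, the values can be collected as the loop runs. The following sketch is an illustration with made-up step sizes. The arrays up and down are grown inside the loops only to keep the example short:

```matlab
% Sketch: record the index values visited for two different steps.
up = [];
for t = 0:0.25:1                 % noninteger step
    up(end+1) = t;
end
down = [];
for j = 10:-5:-3                 % negative step that does not land on finish
    down(end+1) = j;
end
disp(up)
disp(down)
```

The second loop visits 10, 5, and 0 and then stops, because the next value (-5) would pass the finish value of -3.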


EXAMPLE 3.6 Using a for Loop to Compute the Factorial

Problem Statement. Develop an M-file to compute the factorial.

0! = 1
1! = 1
2! = 1 × 2 = 2
3! = 1 × 2 × 3 = 6
4! = 1 × 2 × 3 × 4 = 24
5! = 1 × 2 × 3 × 4 × 5 = 120

Solution. A simple function to implement this calculation can be developed as

function fout = factor(n)
% factor(n):
%   Computes the product of all the integers from 1 to n.
x = 1;
for i = 1:n
  x = x * i;
end
fout = x;
end

which can be run as

>> factor(5)
ans =
   120

This loop will execute 5 times (from 1 to 5). At the end of the process, x will hold a value
of 5! (meaning 5 factorial or 1 × 2 × 3 × 4 × 5 = 120).

Notice what happens if n = 0. For this case, the for loop would not execute, and we
would get the desired result, 0! = 1.

Note that MATLAB has a built-in function factorial that performs this computation.
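
As a quick cross-check, the loop result can be compared with two built-in alternatives. The sketch below is a hedged illustration in script form rather than part of the example. It assumes the built-in functions prod and factorial, and the variable names are chosen only for the comparison:

```matlab
% Sketch: three ways to obtain 5!, plus the n = 0 case.
n = 5;
x = 1;
for i = 1:n                      % same loop as in the function above
    x = x * i;
end
p = prod(1:n);                   % product of the vector 1, 2, ..., n
f = factorial(n);                % built-in factorial function
p0 = prod(1:0);                  % 1:0 is empty, and the product of nothing is 1
disp([x p f p0])
```

The empty-range case mirrors the observation above that the loop body never executes when n = 0.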


Vectorization. The for loop is easy to implement and understand. However, for MATLAB, 
it is not necessarily the most efficient means to repeat statements a specific number of 
times. Because of MATLAB’s ability to operate directly on arrays, vectorization provides 
a much more efficient option. For example, the following for loop structure: 


i = 0;
for t = 0:0.02:50
  i = i + 1;
  y(i) = cos(t);
end

can be represented in vectorized form as

t = 0:0.02:50;
y = cos(t);


It should be noted that for more complex code, it may not be obvious how to vectorize the 
code. That said, wherever possible, vectorization is recommended. 
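
One way to judge the benefit on your own machine is to time both versions and confirm that they agree. The sketch below is an illustration under stated assumptions: it uses tic and toc (described later in this section alongside the pause command) to measure elapsed time, the names y_loop, y_vec, and maxdiff are introduced here, and the timings it prints will vary from run to run and from computer to computer:

```matlab
% Sketch: compare a loop with its vectorized equivalent.
t = 0:0.02:50;
tic
y_loop = zeros(size(t));         % sized in advance so only the loop is timed
for i = 1:length(t)
    y_loop(i) = cos(t(i));
end
t_loop = toc;
tic
y_vec = cos(t);                  % one statement operates on the whole array
t_vec = toc;
maxdiff = max(abs(y_loop - y_vec));
fprintf('loop: %g s, vectorized: %g s, max difference: %g\n', t_loop, t_vec, maxdiff)
```

For a vector this short the difference in time may be small. The point of the comparison is that the two forms produce the same values.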


Preallocation of Memory. MATLAB automatically increases the size of arrays every 
time you add a new element. This can become time consuming when you perform actions 
such as adding new values one at a time within a loop. For example, here is some code that 
sets the values of the elements of y depending on whether or not the values of t are greater than one:

t = 0:.01:5;
for i = 1:length(t)
  if t(i)>1
    y(i) = 1/t(i);
  else
    y(i) = 1;
  end
end


For this case, MATLAB must resize y every time a new value is determined. The following 
code preallocates the proper amount of memory by using a vectorized statement to assign 
ones to y prior to entering the loop. 
t = 0:.01:5;
y = ones(size(t));
for i = 1:length(t)
  if t(i)>1
    y(i) = 1/t(i);
  end
end

Thus, the array is only sized once. In addition, preallocation helps reduce memory fragmentation,
which also enhances efficiency.
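
For this particular calculation the loop can be removed altogether. The following sketch is a hedged alternative, not the approach shown above. It assumes logical indexing, in which a true/false array selects the elements to be changed, and the name idx is introduced only for illustration:

```matlab
% Sketch: preallocate with ones, then overwrite only the elements where t > 1.
t = 0:.01:5;
y = ones(size(t));               % preallocation also supplies the default value of 1
idx = t > 1;                     % logical array: true where t exceeds one
y(idx) = 1 ./ t(idx);            % element-by-element division on the selected values
disp(y([1 end]))
```

This combines the two ideas of this section: the array is sized once, and the assignment is vectorized.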


The while Structure. A while loop repeats as long as a logical condition is true. Its general 
syntax is 


while condition 
statements 
end 


The statements between the while and the end are repeated as long as the condition is true. 
A simple example is 
x = 8
while x > 0
  x = x - 3;
  disp(x)
end

When this code is run, the result is

x =
     8
     5
     2
    -1


The while...break Structure. Although the while structure is extremely useful, the fact 
that it always exits at the beginning of the structure on a false result is somewhat 
constraining. For this reason, languages such as Fortran 90 and Visual Basic have special 
structures that allow loop termination on a true condition anywhere in the loop. Although 
such structures are currently not available in MATLAB, their functionality can be mim- 
icked by a special version of the while loop. The syntax of this version, called a while... 
break structure, can be written as 
while (1) 
statements 
if condition, break, end 


statements 
end 


where break terminates execution of the loop. Thus, a single line if is used to exit the loop 
if the condition tests true. Note that as shown, the break can be placed in the middle of the 
loop (i.e., with statements before and after it). Such a structure is called a midtest loop. 
If the problem required it, we could place the break at the very beginning to create a 
pretest loop. An example is 
while (1) 
if x < 0, break, end
x=x-5; 
end 
Notice how 5 is subtracted from x on each iteration. This represents a mechanism so that the 
loop eventually terminates. Every decision loop must have such a mechanism. Otherwise 
it would become a so-called infinite loop that would never stop.

Alternatively, we could also place the if...break statement at the very end and create
a posttest loop, 


while (1) 

x=x-5; 

if x < 0, break, end 
end 


It should be clear that, in fact, all three structures are really the same. That is, where
we put the exit (beginning, middle, or end) dictates whether we have a pre-, mid-, or
posttest loop. It is this simplicity that led the computer scientists who developed
Fortran 90 and Visual Basic to favor this structure over other forms of the decision loop
such as the conventional while structure.
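
A common safeguard against an accidental infinite loop is to add an iteration limit to the exit test. The sketch below is an illustration with made-up numbers, not an example from the text. The names iter and maxit and the limit of 50 are assumptions:

```matlab
% Sketch: posttest loop that halves x until it drops below 1,
% with an iteration cap as a second way out.
x = 100;
iter = 0;
maxit = 50;                      % illustrative safety limit
while (1)
    x = x / 2;
    iter = iter + 1;
    if x < 1 || iter >= maxit, break, end
end
fprintf('x = %g after %d iterations\n', x, iter)
```

Here the first condition triggers the exit after seven halvings. The cap would take over only if the first condition could never become true.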


The pause Command. There are often times when you might want a program to temporar- 
ily halt. The command pause causes a procedure to stop and wait until any key is hit. A nice 
example involves creating a sequence of plots that a user might want to leisurely peruse 
before moving on to the next. The following code employs a for loop to create a sequence 
of interesting plots that can be viewed in this manner: 


for n = 3:10
  mesh(magic(n))
  pause
end


The pause can also be formulated as pause(n), in which case the procedure will halt 
for n seconds. This feature can be demonstrated by implementing it in conjunction with 
several other useful MATLAB functions. The beep command causes the computer to emit 
a beep sound. Two other functions, tic and toc, work together to measure elapsed time. 
The tic command saves the current time that toc later employs to display the elapsed time. 
The following code then confirms that pause(n) works as advertised complete with sound 
effects: 

tic 

beep 

pause (5) 

beep 

toc 
When this code is run, the computer will beep. Five seconds later it will beep again and 
display the following message: 


Elapsed time is 5.006306 seconds. 


By the way, if you ever have the urge to use the command pause(inf), MATLAB will 
go into an infinite loop. In such cases, you can return to the command prompt by typing 
Ctrl+c or Ctrl+Break. 

Although the foregoing examples might seem a tad frivolous, the commands can be 
quite useful. For instance, tic and toc can be employed to identify the parts of an algorithm 
that consume the most execution time. Further, the Ctrl+c or Ctrl+Break key combina- 
tions come in real handy in the event that you inadvertently create an infinite loop in one 
of your M-files.


3.3.3 Animation 


There are two simple ways to animate a plot in MATLAB. First, if the computations are 
sufficiently quick, the standard plot function can be employed in a way that the animation 
can appear smooth. Here is a code fragment that indicates how a for loop and standard 
plotting functions can be employed to animate a plot, 


% create animation with standard plot functions 


for j=1:n 
plot commands 
end 


Thus, because we do not include hold on, the plot will refresh on each loop iteration. Through 
judicious use of axis commands, this can result in a smoothly changing image. 

Second, there are special functions, getframe and movie, that allow you to capture a 
sequence of plots and then play them back. As the name implies, the getframe function 
captures a snapshot (pixmap) of the current axes or figure. It is usually used in a for loop 
to assemble an array of movie frames for later playback with the movie function, which has 
the following syntax: 


movie(m,n,fps)


where m = the vector or matrix holding the sequence of frames constituting the movie, 
n= an optional variable specifying how many times the movie is to be repeated (if it is 
omitted, the movie plays once), and fps = an optional variable that specifies the movie’s 
frame rate (if it is omitted, the default is 12 frames per second). Here is a code fragment 
that indicates how a for loop along with the two functions can be employed to create 
a movie, 


% create animation with getframe and movie 
for j=1:n 
plot commands 
M(j) = getframe; 
end 
movie(M) 


Each time the loop executes, the plot commands create an updated version of a plot, which 
is then stored in the vector M. After the loop terminates, the n images are then played back 
by movie. 


EXAMPLE 3.7 Animation of Projectile Motion

Problem Statement. In the absence of air resistance, the Cartesian coordinates of a projectile
launched with an initial velocity ($v_0$) and angle ($\theta_0$) can be computed with

$$x = v_0 \cos(\theta_0)\, t$$

$$y = v_0 \sin(\theta_0)\, t - 0.5 g t^2$$

where $g = 9.81\ \mathrm{m/s^2}$. Develop a script to generate an animated plot of the projectile’s
trajectory given that $v_0 = 5$ m/s and $\theta_0 = 45°$.

Solution. A script to generate the animation can be written as

clc,clf,clear
g=9.81; theta0=45*pi/180; v0=5;
t(1)=0;x=0;y=0;
plot(x,y,'o','MarkerFaceColor','b','MarkerSize',8)
axis([0 3 0 0.8])
M(1)=getframe;
dt=1/128;
for j = 2:1000
  t(j)=t(j-1)+dt;
  x=v0*cos(theta0)*t(j);
  y=v0*sin(theta0)*t(j)-0.5*g*t(j)^2;
  plot(x,y,'o','MarkerFaceColor','b','MarkerSize',8)
  axis([0 3 0 0.8])
  M(j)=getframe;
  if y<=0, break, end
end
pause
movie(M,1)

Several features of this script bear mention. First, notice that we have fixed the ranges 
for the x and y axes. If this is not done, the axes will rescale and cause the animation to jump 
around. Second, we terminate the for loop when the projectile’s height y falls below zero. 

When the script is executed, two animations will be displayed (we’ve placed a pause 
between them). The first corresponds to the sequential generation of the frames within the 
loop, and the second corresponds to the actual movie. Although we cannot show the results 
here, the trajectory for both cases will look like Fig. 3.2. You should enter and run the 
foregoing script in MATLAB to see the actual animation. 
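
The fixed axis limits in the script can be justified with a short calculation. Setting y = 0 in the trajectory equation gives the flight time, and the range and peak height follow from it. The sketch below is a supporting calculation rather than part of the original solution. The names tf, xmax, ymax, and nframes are introduced here for illustration:

```matlab
% Sketch: closed-form flight time, range, and peak height for the projectile.
g = 9.81; theta0 = 45*pi/180; v0 = 5;
tf   = 2*v0*sin(theta0)/g;           % time at which y returns to zero (s)
xmax = v0*cos(theta0)*tf;            % horizontal distance at landing (m)
ymax = (v0*sin(theta0))^2/(2*g);     % peak height, reached at tf/2 (m)
dt = 1/128;
nframes = ceil(tf/dt) + 1;           % frames stored before the loop breaks
fprintf('tf = %.4f s, range = %.4f m, peak = %.4f m, frames = %d\n', ...
        tf, xmax, ymax, nframes)
```

The range of about 2.55 m and the peak of about 0.64 m both fit inside the limits [0 3 0 0.8], and the frame count shows that the loop exits long before its upper bound of 1000.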


FIGURE 3.2 
Plot of a projectile’s trajectory.

3.4 NESTING AND INDENTATION

We need to understand that structures can be “nested” within each other. Nesting refers
to placing structures within other structures. The following example illustrates the concept.

EXAMPLE 3.8 Nesting Structures

Problem Statement. The roots of a quadratic equation

$$f(x) = ax^2 + bx + c$$

can be determined with the quadratic formula

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

Develop a function to implement this formula given values of the coefficients.


Solution. Top-down design provides a nice approach for designing an algorithm to compute 
the roots. This involves developing the general structure without details and then refining 
the algorithm. To start, we first recognize that depending on whether the parameter a is 
zero, we will either have “special” cases (e.g., single roots or trivial values) or conventional 
cases using the quadratic formula. This “big-picture” version can be programmed as 


function quadroots(a, b, c)
% quadroots: roots of quadratic equation
%   quadroots(a,b,c): real and complex roots
%                     of quadratic equation
% input:
%   a = second-order coefficient
%   b = first-order coefficient
%   c = zero-order coefficient
% output:
%   r1 = real part of first root
%   i1 = imaginary part of first root
%   r2 = real part of second root
%   i2 = imaginary part of second root
if a == 0
  %special cases
else
  %quadratic formula
end


Next, we develop refined code to handle the “special” cases: 


%special cases
if b ~= 0
  %single root
  r1 = -c / b
else
  %trivial solution
  disp('Trivial solution. Try again')
end


And we can develop refined code to handle the quadratic formula cases: 


%quadratic formula
d = b ^ 2 - 4 * a * c;
if d >= 0
  %real roots
  r1 = (-b + sqrt(d)) / (2 * a)
  r2 = (-b - sqrt(d)) / (2 * a)
else
  %complex roots
  r1 = -b / (2 * a)
  i1 = sqrt(abs(d)) / (2 * a)
  r2 = r1
  i2 = -i1
end


We can then merely substitute these blocks back into the simple “big-picture” framework
to give the final result:

function quadroots(a, b, c)
% quadroots: roots of quadratic equation
%   quadroots(a,b,c): real and complex roots
%                     of quadratic equation
% input:
%   a = second-order coefficient
%   b = first-order coefficient
%   c = zero-order coefficient
% output:
%   r1 = real part of first root
%   i1 = imaginary part of first root
%   r2 = real part of second root
%   i2 = imaginary part of second root
if a == 0
  %special cases
  if b ~= 0
    %single root
    r1 = -c / b
  else
    %trivial solution
    disp('Trivial solution. Try again')
  end
else
  %quadratic formula
  d = b ^ 2 - 4 * a * c;  %discriminant
  if d >= 0
    %real roots
    r1 = (-b + sqrt(d)) / (2 * a)
    r2 = (-b - sqrt(d)) / (2 * a)
  else
    %complex roots
    r1 = -b / (2 * a)
    i1 = sqrt(abs(d)) / (2 * a)
    r2 = r1
    i2 = -i1
  end
end

Notice how indentation helps to make the underlying
logical structure clear. Also notice how “modular” the structures are. Here is a command 
window session illustrating how the function performs:

>> quadroots(1,1,1)
r1 =
   -0.5000
i1 =
    0.8660
r2 =
   -0.5000
i2 =
   -0.8660

>> quadroots(1,5,1)
r1 =
   -0.2087
r2 =
   -4.7913

>> quadroots(0,5,1)
r1 =
   -0.2000

>> quadroots(0,0,0)
Trivial solution. Try again
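
The values above can be cross-checked independently. The sketch below is a verification aid, not part of the example. It assumes MATLAB's built-in roots function, which is not covered in this section and which takes the polynomial coefficients in order of descending power:

```matlab
% Sketch: compare against the built-in polynomial root finder.
r_real = roots([1 5 1]);         % x^2 + 5x + 1: two real roots
r_cplx = roots([1 1 1]);         % x^2 + x + 1: a complex conjugate pair
disp(r_real)
disp(r_cplx)
```

Unlike quadroots, roots returns complex results as single complex numbers rather than as separate real and imaginary parts.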


3.5 PASSING FUNCTIONS TO M-FILES


Much of the remainder of the book involves developing functions to numerically eval- 
uate other functions. Although a customized function could be developed for every 
new equation we analyzed, a better alternative is to design a generic function and 
pass the particular equation we wish to analyze as an argument. In the parlance of 
MATLAB, these functions are given a special name: function functions. Before de- 
scribing how they work, we will first introduce anonymous functions, which provide a 
handy means to define simple user-defined functions without developing a full-blown 
M-file. 


3.5.1 Anonymous Functions 


Anonymous functions allow you to create a simple function without creating an M-file. 
They can be defined within the command window with the following syntax: 


fhandle = @(arglist) expression 


where fhandle = the function handle you can use to invoke the function, arglist = a 
comma separated list of input arguments to be passed to the function, and expression = any 
single valid MATLAB expression. For example, 


>> f1=@(x,y) x^2 + y^2; 

Once these functions are defined in the command window, they can be used just as other
functions: 


>> f1(3,4) 
ans = 
25 
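
Because function functions typically hand a whole vector of values to the passed function, it can help to write an anonymous function with element-wise operators. The following is a minimal sketch under that assumption; the name f1v is hypothetical and does not appear elsewhere in the text.

```matlab
% f1v: element-wise variant of f1 (hypothetical name, for illustration only)
f1v = @(x,y) x.^2 + y.^2;   % .^ squares each element individually

x = [1 2 3];
y = [4 5 6];
z = f1v(x,y);               % evaluates 1^2+4^2, 2^2+5^2, 3^2+6^2
disp(z)

s = f1v(3,4);               % scalar inputs behave exactly like f1
disp(s)
```

With scalar arguments f1 and f1v agree, but calling f1 with the row vectors above would raise an error, because ^ attempts a matrix power.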


Aside from the variables in its argument list, an anonymous function can include vari- 
ables that exist in the workspace where it is created. For example, we could create an 
anonymous function f(x) = 4x^2 as


>> a=4; 
>> b= 2; 
>> f2=@(x) a*x^b; 
>> f2(3) 


ans = 36 


Note that if subsequently we enter new values for a and b, the anonymous function 
does not change: 


>> a= 3; 
>> f2(3) 


ans = 36 


Thus, the function handle holds a snapshot of the function at the time it was created. If we
want the variables to take on the new values, we must recreate the function. For example, having
changed a to 3,


>> f2=@(x) a*x^b;


with the result


>> f2(3) 


ans = 
27 


It should be noted that prior to MATLAB 7, inline functions performed the same role 
as anonymous functions. For example, the anonymous function developed above, f1, could 
be written as 


>> f1=inline('x^2 + y^2','x','y');


Although they are being phased out in favor of anonymous functions, some readers might be
using earlier versions, and so we thought it would be helpful to mention them. MATLAB
help can be consulted to learn more about their use and limitations.


3.5.2 Function Functions 


Function functions are functions that operate on other functions, which are passed to them as
input arguments. The function that is passed to the function function is referred to as the
passed function. A simple example is the built-in function fplot, which plots the graphs of 
functions. A simple representation of its syntax is 


fplot(func,lims)


where func is the function being plotted between the x-axis limits specified by lims = [xmin
xmax]. For this case, func is the passed function. This function is “smart” in that it automatically
analyzes the function and decides how many values to use so that the plot will exhibit
all the function’s features.

Here is an example of how fplot can be used to plot the velocity of the free-falling 
bungee jumper. The function can be created with an anonymous function: 


>> vel=@(t) ... 
sqrt(9.81*68.1/0.25)*tanh(sqrt(9.81*0.25/68.1)*t); 


We can then generate a plot from t = 0 to 12 as

>> fplot(vel,[0 12]) 


The result is displayed in Fig. 3.3. 

Note that in the remainder of this book, we will have many occasions to use MATLAB’s 
built-in function functions. As in the following example, we will also be developing 
our own. 


FIGURE 3.3 
A plot of velocity versus time generated with the fplot function. 

EXAMPLE 3.9 Building and Implementing a Function Function


Problem Statement. Develop an M-file function function to determine the average value 
of a function over a range. Illustrate its use for the bungee jumper velocity over the range 
from t = 0 to 12 s:


$$v(t) = \sqrt{\frac{g m}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right)$$


where g = 9.81, m = 68.1, and c_d = 0.25.


Solution. The average value of the function can be computed with standard MATLAB 
commands as 


>> t=linspace(0,12); 
>> v=sqrt(9.81*68.1/0.25)*tanh(sqrt(9.81*0.25/68.1)*t) ; 
>> mean(v) 


ans = 
36.0870 


Inspection of a plot of the function (Fig. 3.3) shows that this result is a reasonable estimate 
of the curve’s average height. 
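
One way to judge how reasonable the estimate is would be to compare it with the exact average, obtained by integrating v(t) analytically. The sketch below assumes the closed form (m/c_d) ln(cosh(sqrt(g c_d/m) t)) for the integral of v(t), which follows from the antiderivative of tanh; this derivation and the variable names are not part of the original example.

```matlab
% Compare the sample mean of v(t) with the exact average over [0, tf]
g = 9.81; m = 68.1; cd = 0.25; tf = 12;

% exact average = (1/tf) * integral of v from 0 to tf (assumed closed form)
vexact = (m/cd)*log(cosh(sqrt(g*cd/m)*tf))/tf;

npts = [10 100 1000];               % number of sample points to try
vmean = zeros(size(npts));
for k = 1:length(npts)
  t = linspace(0,tf,npts(k));
  v = sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t);
  vmean(k) = mean(v);               % equally weighted sample mean
  fprintf('n = %4d  mean = %8.4f  exact = %8.4f\n', npts(k), vmean(k), vexact);
end
```

The sample mean gives the two end points the same weight as the interior points, so it approaches the exact average only gradually as the number of points grows.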
We can write an M-file to perform the same computation: 


function favg = funcavg(a,b,n)
% funcavg: average function height
%   favg=funcavg(a,b,n): computes average value
%                        of function over a range
% input:
%   a = lower bound of range
%   b = upper bound of range
%   n = number of equally spaced points
% output:
%   favg = average value of function
x = linspace(a,b,n);
y = func(x);
favg = mean(y);
end

function f = func(t)
f=sqrt(9.81*68.1/0.25)*tanh(sqrt(9.81*0.25/68.1)*t);
end

The main function first uses linspace to generate equally spaced x values across 
the range. These values are then passed to a subfunction func in order to generate the cor- 
responding y values. Finally, the average value is computed. The function can be run from 
the command window as 


>> funcavg(0,12,60)
ans = 
36.0127 
Now let’s rewrite the M-file so that rather than being specific to func, it evaluates a 
nonspecific function name f that is passed in as an argument: 


function favg = funcavg(f,a,b,n)
% funcavg: average function height
%   favg=funcavg(f,a,b,n): computes average value
%                          of function over a range
% input:
%   f = function to be evaluated
%   a = lower bound of range
%   b = upper bound of range
%   n = number of equally spaced points
% output:
%   favg = average value of function
x = linspace(a,b,n);
y = f(x);
favg = mean(y);


Because we have removed the subfunction func, this version is truly generic. It can be run 
from the command window as 

>> vel=@(t) ...
sqrt(9.81*68.1/0.25)*tanh(sqrt(9.81*0.25/68.1)*t);
>> funcavg(vel,0,12,60)


ans = 
36.0127 


To demonstrate its generic nature, funcavg can easily be applied to another case by
merely passing it a different function. For example, it could be used to determine the average
value of the built-in sin function between 0 and 2π as


>> funcavg(@sin,0,2*pi,180) 


ans = 
-6.3001e-017 


Does this result make sense? 
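
One way to reason about the question is to compare the computed value with the exact average of the sine over a full period, which is zero because the positive and negative half-waves cancel. The following is a minimal sketch of that check; the variable names are illustrative assumptions.

```matlab
% Average of sin over one full period: numerical versus exact
x = linspace(0,2*pi,180);
favg = mean(sin(x));                  % same computation funcavg performs

% exact average: (1/(2*pi)) * integral of sin from 0 to 2*pi
exact = (cos(0) - cos(2*pi))/(2*pi);

fprintf('numerical = %g, exact = %g\n', favg, exact);
```

A numerical value on the order of 1e-17 differs from zero only through roundoff, so a tiny result of either sign is consistent with the exact answer.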

We can see that funcavg is now designed to evaluate any valid MATLAB expres- 
sion. We will do this on numerous occasions throughout the remainder of this text in a 
number of contexts ranging from nonlinear equation solving to the solution of differential 
equations. 


3.5.3 Passing Parameters 


Recall from Chap. 1 that the terms in mathematical models can be divided into depen- 
dent and independent variables, parameters, and forcing functions. For the bungee jumper 
model, the velocity (v) is the dependent variable, time (t) is the independent variable, the
mass (m) and drag coefficient (c_d) are parameters, and the gravitational constant (g) is the
forcing function. It is commonplace to investigate the behavior of such models by perform- 
ing a sensitivity analysis. This involves observing how the dependent variable changes as 
the parameters and forcing functions are varied. 

In Example 3.9, we developed a function function, funcavg, and used it to determine 
the average value of the bungee jumper velocity for the case where the parameters were 
set at m = 68.1 and c_d = 0.25. Suppose that we wanted to analyze the same function, but
with different parameters. Of course, we could retype the function with new values for each 
case, but it would be preferable to just change the parameters. 

As we learned in Sec. 3.5.1, it is possible to incorporate parameters into anonymous
functions. For example, rather than “wiring” the numeric values, we could have done the 
following: 


>> m=68.1;cd=0.25; 
>> vel=@(t) sqrt(9.81*m/cd)*tanh(sqrt(9.81*cd/m)*t) ; 
>> funcavg(vel,0,12,60) 


ans = 
36.0127 


However, if we want the parameters to take on new values, we must recreate the anony- 
mous function. 

MATLAB offers a better alternative by adding the term varargin as the function func- 
tion’s last input argument. In addition, every time the passed function is invoked within the 
function function, the term varargin{:} should be added to the end of its argument list (note 
the curly brackets). Here is how both modifications can be implemented for funcavg (omit- 
ting comments for conciseness): 


function favg = funcavg(f,a,b,n,varargin)
x = linspace(a,b,n);
y = f(x,varargin{:});
favg = mean(y);


When the passed function is defined, the actual parameters should be added at the end 
of the argument list. If we used an anonymous function, this can be done as in 


>> vel=@(t,m,cd) sqrt(9.81*m/cd)*tanh(sqrt(9.81*cd/m)*t);


When all these changes have been made, analyzing different parameters becomes easy. To
implement the case where m = 68.1 and c_d = 0.25, we could enter

>> funcavg(vel,0,12,60,68.1,0.25) 

ans = 


36.0127 


An alternative case, say m = 100 and c_d = 0.28, could be rapidly generated by merely
changing the arguments: 


>> funcavg(vel,0,12,60,100,0.28) 


ans = 
38.9345 


A New Approach for Passing Parameters. At the time of this edition’s develop- 
ment, MATLAB is going through a transition to a new and better way of passing 
parameters to function functions. As in the previous example, if the function being 
passed is 


>> vel=@(t,m,cd) sqrt(9.81*m/cd)*tanh(sqrt(9.81*cd/m)*t) ; 
then you invoke the function as 


>> funcavg(vel,0,12,60,68.1,0.25) 

The MathWorks developers thought this approach was cumbersome, so they devised
the following alternative:


>> funcavg(@(t) vel(t,68.1,0.25) ,0,12,60) 


Thus, the extra parameters are not strung out at the end of the argument list, which makes
it clear that they belong to the passed function.

I’ve described both the “old” and the new way of passing parameters because MATLAB
will maintain the old way in functions that support it in order to minimize backward
incompatibilities. So if you have old code that has worked in the past, there is no need to go back
and convert old code to the new way. For new code, however, I strongly recommend using 
the new way, because it is easier to read and more versatile. 
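
To illustrate how the new way lends itself to a sensitivity analysis, the following sketch averages the velocity for several masses by rebuilding a one-argument wrapper inside a loop. It is a sketch under stated assumptions: funcavg is replaced here by a one-line anonymous stand-in so that the script is self-contained, and the list of masses is chosen arbitrarily for illustration.

```matlab
% One-line stand-in for the funcavg M-file (assumption made for self-containment)
funcavg = @(f,a,b,n) mean(f(linspace(a,b,n)));

% Passed function with its parameters listed after t
vel = @(t,m,cd) sqrt(9.81*m/cd)*tanh(sqrt(9.81*cd/m)*t);

masses = [60 68.1 80 100];      % illustrative masses (kg)
cd = 0.25;                      % drag coefficient (kg/m)
avg = zeros(size(masses));
for k = 1:length(masses)
  m = masses(k);
  % the wrapper captures the current m and cd when it is created
  avg(k) = funcavg(@(t) vel(t,m,cd),0,12,60);
  fprintf('m = %6.1f kg  average velocity = %8.4f m/s\n', m, avg(k));
end
```

Because each wrapper is created inside the loop, it takes its snapshot of m at that moment, which sidesteps the need to recreate vel itself.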


3.6 CASE STUDY  BUNGEE JUMPER VELOCITY


Background. In this section, we will use MATLAB to solve the free-falling bungee jumper 
problem we posed at the beginning of this chapter. This involves obtaining a solution of 


$$\frac{dv}{dt} = g - \frac{c_d}{m} v|v|$$


Recall that, given an initial condition for time and velocity, the problem involved iteratively
solving the formula,


$$v_{i+1} = v_i + \frac{dv_i}{dt} \Delta t$$

Now also remember that to attain good accuracy, we would employ small steps. Therefore, 
we would probably want to apply the formula repeatedly to step out from our initial time 
to attain the value at the final time. Consequently, an algorithm to solve the problem would 
be based on a loop. 




Solution. Suppose that we started the computation at t = 0 and wanted to predict velocity
at t = 12 s using a time step of Δt = 0.5 s. We would therefore need to apply the iterative
equation 24 times, that is,


$$n = \frac{12}{0.5} = 24$$


where n = the number of iterations of the loop. Because this result is exact (i.e., the ratio is 
an integer), we can use a for loop as the basis for the algorithm. Here’s an M-file to do this 
including a subfunction defining the differential equation: 


function vend = velocity1(dt, ti, tf, vi)
% velocity1: Euler solution for bungee velocity
%   vend = velocity1(dt, ti, tf, vi)
%          Euler method solution of bungee
%          jumper velocity
% input:
%   dt = time step (s)
%   ti = initial time (s)
%   tf = final time (s)
%   vi = initial value of dependent variable (m/s)
% output:
%   vend = velocity at tf (m/s)
t = ti;
v = vi;
n = (tf - ti) / dt;
for i = 1:n
  dvdt = deriv(v);
  v = v + dvdt * dt;
  t = t + dt;
end
vend = v;
end

function dv = deriv(v)
dv = 9.81 - (0.25 / 68.1) * v*abs(v);
end


This function can be invoked from the command window with the result: 
>> velocity1(0.5,0,12,0) 


ans = 
50.9259 


Note that the true value obtained from the analytical solution is 50.6175 (Example 3.1). 
We can then try a much smaller value of dt to obtain a more accurate numerical result: 


>> velocity1(0.001,0,12,0) 
ans = 
50.6181 
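
To see how the error shrinks as the step is reduced, the Euler loop can be repeated for several step sizes and compared with the analytical value at t = 12 s. The following is a minimal sketch; the list of step sizes is an arbitrary choice for illustration, and the loop is written inline rather than calling velocity1 so that the script is self-contained.

```matlab
% Euler's method for dv/dt = g - (cd/m)*v*|v| at several step sizes
dvdt = @(v) 9.81 - (0.25/68.1)*v*abs(v);

% analytical velocity at t = 12 s for comparison
vtrue = sqrt(9.81*68.1/0.25)*tanh(sqrt(9.81*0.25/68.1)*12);

dts = [2 1 0.5 0.25];           % step sizes that divide the 12-s interval evenly
err = zeros(size(dts));
for k = 1:length(dts)
  v = 0;                        % initial velocity
  n = round(12/dts(k));         % number of Euler steps
  for i = 1:n
    v = v + dvdt(v)*dts(k);
  end
  err(k) = abs(v - vtrue);
  fprintf('dt = %5.2f  v(12) = %8.4f  error = %7.4f\n', dts(k), v, err(k));
end
```

For this problem the error falls with each reduction of the step, at the cost of proportionally more iterations.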
Although this function is certainly simple to program, it is not foolproof. In particu- 
lar, it will not work if the computation interval is not evenly divisible by the time step. 
To cover such cases, a while . . . break loop can be substituted in place of the shaded area 
(note that we have omitted the comments for conciseness): 


function vend = velocity2(dt, ti, tf, vi)
t = ti;
v = vi;
h = dt;
while (1)
  if t + dt > tf, h = tf - t; end
  dvdt = deriv(v);
  v = v + dvdt * h;
  t = t + h;
  if t >= tf, break, end
end
vend = v;
end

function dv = deriv(v)
dv = 9.81 - (0.25 / 68.1) * v*abs(v);
end




As soon as we enter the while loop, we use a single line if structure to test whether 
adding t + dt will take us beyond the end of the interval. If not (which would usually be 
the case at first), we do nothing. If so, we would shorten up the interval—that is, we set the 
variable step h to the interval remaining: tf - t. By doing this, we guarantee that the last 
step falls exactly on tf. After we implement this final step, the loop will terminate because 
the condition t >= tf will test true. 

Notice that before entering the loop, we assign the value of the time step dt to another 
variable h. We create this dummy variable so that our routine does not change the given 
value of dt if and when we shorten the time step. We do this in anticipation that we might 
need to use the original value of dt somewhere else in the event that this code were inte- 
grated within a larger program. 

If we run this new version, the result will be the same as for the version based on the 
for loop structure: 


>> velocity2(0.5,0,12,0) 


ans = 
50.9259 


Further, we can use a dt that is not evenly divisible into tf - ti: 


>> velocity2(0.35,0,12,0) 


ans = 
50.8348 


We should note that the algorithm is still not foolproof. For example, the user could 
have mistakenly entered a step size greater than the calculation interval (e.g., tf - ti = 5 and
dt = 20). Thus, you might want to include error traps in your code to catch such errors and 
then allow the user to correct the mistake. 
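
As one possible form such error traps could take, the sketch below adds three input checks ahead of the while...break loop. The function name velocity3 and the error identifiers are hypothetical, and stopping with an error message is only one of several reasonable responses.

```matlab
function vend = velocity3(dt, ti, tf, vi)
% velocity3: hypothetical variant of velocity2 with input checks
% stop with a descriptive message if the arguments are inconsistent
if tf <= ti
  error('velocity3:badInterval', 'tf must be greater than ti');
end
if dt <= 0
  error('velocity3:badStep', 'dt must be positive');
end
if dt > tf - ti
  error('velocity3:stepTooLarge', 'dt must not exceed tf - ti');
end
t = ti; v = vi; h = dt;
while (1)
  if t + dt > tf, h = tf - t; end   % shorten the final step to land on tf
  v = v + deriv(v) * h;
  t = t + h;
  if t >= tf, break, end
end
vend = v;
end

function dv = deriv(v)
dv = 9.81 - (0.25 / 68.1) * v*abs(v);
end
```

With valid arguments the function behaves like velocity2; a call such as velocity3(20,0,5,0) would stop with the step-size message instead of silently taking one shortened step.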

As a final note, we should recognize that the foregoing code is not generic. That is, we 
have designed it to solve the specific problem of the velocity of the bungee jumper. A more 
generic version can be developed as 


function yend = odesimp(dydt, dt, ti, tf, yi)
t = ti; y = yi; h = dt;
while (1)
  if t + dt > tf, h = tf - t; end
  y = y + dydt(y) * h;
  t = t + h;
  if t >= tf, break, end
end
yend = y;


Notice how we have stripped out the parts of the algorithm that were specific to the
bungee example (including the subfunction defining the differential equation) while keeping
the essential features of the solution technique. We can then use this routine to solve
the bungee jumper example by specifying the differential equation with an anonymous
function and passing its function handle to odesimp to generate the solution:


>> dvdt=@(v) 9.81-(0.25/68.1)*v*abs(v) ; 
>> odesimp(dvdt,0.5,0,12,0) 


ans = 
50.9259


We could then analyze a different function without having to go in and modify the
M-file. For example, if y = 10 at t = 0, the differential equation dy/dt = -0.1y has the
analytical solution y = 10e^(-0.1t). Therefore, the solution at t = 5 would be y(5) = 10e^(-0.1(5)) =
6.0653. We can use odesimp to obtain nearly the same result numerically as in


>> odesimp(@(y) -0.1*y,0.005,0,5,10) 


ans = 
6.0645 


Finally, we can use varargin and the new way of passing parameters to develop a final 
and superior version. To do this, the odesimp function is first modified by adding the high- 
lighted code 


function yend = odesimp2(dydt, dt, ti, tf, yi, varargin)
t = ti; y = yi; h = dt;
while (1)
  if t + dt > tf, h = tf - t; end
  y = y + dydt(y, varargin{:}) * h;
  t = t + h;
  if t >= tf, break, end
end
yend = y;


Then, we can develop a script to perform the computation, 


clc
format compact
dvdt=@(v,cd,m) 9.81-(cd/m)*v*abs(v);
odesimp2(@(v) dvdt(v,0.25,68.1),0.5,0,12,0)


which yields the correct result 


ans = 
50.9259 


PROBLEMS 

3.1 Figure P3.1 shows a cylindrical tank with a conical
base. If the liquid level is quite low, in the conical part, the 
volume is simply the conical volume of liquid. If the liquid 
level is midrange in the cylindrical part, the total volume of 
liquid includes the filled conical part and the partially filled 
cylindrical part. 

FIGURE P3.1
[Figure: cylindrical tank with a conical base; dimension labels 2R, R, and d.]

Use decisional structures to write an M-file to compute
the tank’s volume as a function of given values of R and d.
Design the function so that it returns the volume for all cases
where the depth is less than 3R. Return an error message
(“Overtop”) if you overtop the tank, that is, d > 3R. Test it
with the following data:


[Table of test values for R and d; the entries are not legible in this copy.]


Note that the tank’s radius is R. 

3.2 An amount of money P is invested in an account where 
interest is compounded at the end of the period. The future 
worth F yielded at an interest rate i after n periods may be 
determined from the following formula: 


$$F = P(1 + i)^n$$


Write an M-file that will calculate the future worth of an 
investment for each year from 1 through n. The input to the 
function should include the initial investment P, the interest 
rate i (as a decimal), and the number of years n for which the 
future worth is to be calculated. The output should consist 
of a table with headings and columns for n and F. Run the 
program for P = $100,000, i = 0.05, and n = 10 years. 

3.3 Economic formulas are available to compute annual 
payments for loans. Suppose that you borrow an amount
of money P and agree to repay it in n annual payments at
an interest rate of i. The formula to compute the annual
payment A is


$$A = P \frac{i(1 + i)^n}{(1 + i)^n - 1}$$


Write an M-file to compute A. Test it with P = $100,000 and 
an interest rate of 3.3% (i = 0.033). Compute results for n = 
1, 2, 3, 4, and 5 and display the results as a table with head- 
ings and columns for n and A. 

3.4 The average daily temperature for an area can be ap- 
proximated by the following function: 


$$T = T_{\text{mean}} + (T_{\text{peak}} - T_{\text{mean}}) \cos(\omega(t - t_{\text{peak}}))$$


where T_mean = the average annual temperature, T_peak = the
peak temperature, ω = the frequency of the annual variation
(= 2π/365), and t_peak = day of the peak temperature
(205 d). Parameters for some U.S. towns are listed here:


City            T_mean (°C)   T_peak (°C)
Miami, FL          22.1          28.3
Yuma, AZ           23.1          33.6
Bismarck, ND        5.2          22.1
Seattle, WA        10.6          17.6
Boston, MA         10.7          22.9


Develop an M-file that computes the average temperature 
between two days of the year for a particular city. Test it 
for (a) January-February in Yuma, AZ (t = 0 to 59) and 
(b) July-August temperature in Seattle, WA (t = 180 to 242). 
3.5 The sine function can be evaluated by the following 
infinite series: 


$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$$

Create an M-file to implement this formula so that it com- 
putes and displays the values of sin x as each term in the 
series is added. In other words, compute and display in 


sequence the values for 


$$\sin x = x$$

$$\sin x = x - \frac{x^3}{3!}$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!}$$

up to the order term of your choosing. For each of the preceding,
compute and display the percent relative error as


$$\%\text{error} = \frac{\text{true} - \text{series approximation}}{\text{true}} \times 100\%$$


As a test case, employ the program to compute sin(0.9)
for up to and including eight terms, that is, up to the term
x^15/15!.

3.6 Two distances are required to specify the location 
of a point relative to an origin in two-dimensional space 
(Fig. P3.6): 


• The horizontal and vertical distances (x, y) in Cartesian
coordinates.
• The radius and angle (r, θ) in polar coordinates.


It is relatively straightforward to compute Cartesian coordinates
(x, y) on the basis of polar coordinates (r, θ). The
reverse process is not so simple. The radius can be computed 
by the following formula: 


$$r = \sqrt{x^2 + y^2}$$


If the coordinates lie within the first and fourth quadrants
(i.e., x > 0), then a simple formula can be used to
compute θ:


$$\theta = \tan^{-1}\left(\frac{y}{x}\right)$$


FIGURE P3.6
[Figure: Cartesian plane with quadrants I through IV labeled.]


The difficulty arises for the other cases. The following table 
summarizes the possibilities: 


x      y      θ
<0     >0     tan^-1(y/x) + π
<0     <0     tan^-1(y/x) - π
<0     =0     π
=0     >0     π/2
=0     <0     -π/2
=0     =0     0


Write a well-structured M-file using if...elseif structures
to calculate r and θ as a function of x and y. Express the final
results for θ in degrees. Test your program by evaluating the
following cases: 


x y r θ
2 0 
2 1 
0 3 
-3 1 
-2 0 
-1 -2 
0 0 
0 -2 
2 2 


3.7 Develop an M-file to determine polar coordinates as 
described in Prob. 3.6. However, rather than designing the 
function to evaluate a single case, pass vectors of x and y. 
Have the function display the results as a table with columns 
for x, y, r, and θ. Test the program for the cases outlined in
Prob. 3.6. 

3.8 Develop an M-file function that is passed a numeric 
grade from 0 to 100 and returns a letter grade according to 
the scheme: 


Letter   Criteria
A        90 ≤ numeric grade ≤ 100
B        80 ≤ numeric grade < 90
C        70 ≤ numeric grade < 80
D        60 ≤ numeric grade < 70
F        numeric grade < 60


The first line of the function should be 


function grade = lettergrade(score) 

FIGURE P3.10
[Figure: simply supported beam; load labels 20 kips/ft, 150 kip-ft, and 15 kips.]


Design the function so that it displays an error message and 
terminates in the event that the user enters a value of score 
that is less than zero or greater than 100. Test your function 
with 89.9999, 90, 45, and 120. 

3.9 Manning’s equation can be used to compute the velocity 
of water in a rectangular open channel: 


$$U = \frac{\sqrt{S}}{n} \left(\frac{BH}{B + 2H}\right)^{2/3}$$


where U = velocity (m/s), S = channel slope, n = roughness 
coefficient, B = width (m), and H = depth (m). The follow- 
ing data are available for five channels: 


n S B H 
0.036 0.0001 10 2 
0.020 0.0002 8 1 
0.015 0.0012 20 1.5 
0.030 0.0007 25 3 
0.022 0.0003 15 2.6 


Write an M-file that computes the velocity for each of 
these channels. Enter these values into a matrix where each 
column represents a parameter and each row represents a 
channel. Have the M-file display the input data along with 
the computed velocity in tabular form where velocity is 
the fifth column. Include headings on the table to label the 
columns. 

3.10 A simply supported beam is loaded as shown in 
Fig. P3.10. Using singularity functions, the displacement 
along the beam can be expressed by the equation: 


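$$u(x) = -\frac{5}{6}\left[\langle x - 0 \rangle^4 - \langle x - 5 \rangle^4\right] + \frac{15}{6}\langle x - 8 \rangle^3 + 75\langle x - 7 \rangle^2 + \frac{57}{6}x^3 - 238.25x$$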


By definition, the singularity function can be expressed as 
follows: 


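$$\langle x - a \rangle^n = \begin{cases} (x - a)^n & \text{when } x > a \\ 0 & \text{when } x \le a \end{cases}$$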


Develop an M-file that creates a plot of displacement 
(dashed line) versus distance along the beam, x. Note that 
x = 0 at the left end of the beam. 

3.11 The volume V of liquid in a hollow horizontal cylinder of 
radius r and length L is related to the depth of the liquid h by 


$$V = \left[r^2 \cos^{-1}\left(\frac{r - h}{r}\right) - (r - h)\sqrt{2rh - h^2}\right] L$$


Develop an M-file to create a plot of volume versus depth. 
Here are the first few lines: 


function cylinder(r, L, plot_title) 
% volume of horizontal cylinder 


% inputs: 
% r = radius 
% L = length 


% plot_title = string holding plot title 


Test your program with 


>> cylinder(3,5,...
'Volume versus depth for horizontal cylindrical tank')


3.12 Develop a vectorized version of the following code: 


tstart=0; tend=20; ni=8; 
t(1)=tstart; 
y(1)=12 + 6*cos(2*pi*t(1)/(tend-tstart) ); 
for i=2:ni+1
t(i)=t(i-1)+(tend-tstart)/ni; 
y(i)=12 + 6*cos(2*pi*t(i)/ ... 
(tend-tstart) ); 
end 

3.13 The “divide and average” method, an old-time method
for approximating the square root of any positive number a, 
can be formulated as 


$$x = \frac{x + a/x}{2}$$


Write a well-structured M-file function based on the 
while...break loop structure to implement this algorithm. 
Use proper indentation so that the structure is clear. At each 
step estimate the error in your approximation as 


$$\varepsilon = \left|\frac{x_{\text{new}} - x_{\text{old}}}{x_{\text{new}}}\right|$$


Repeat the loop until ε is less than or equal to a specified
value. Design your program so that it returns both the result
and the error. Make sure that it can evaluate the square
root of numbers that are equal to and less than zero. For the
latter case, display the result as an imaginary number. For
example, the square root of -4 would return 2i. Test your
program by evaluating a = 0, 2, 10, and -4 for ε = 1 × 10^-4.

3.14 Piecewise functions are sometimes useful when the re- 
lationship between a dependent and an independent variable 
cannot be adequately represented by a single equation. For 
example, the velocity of a rocket might be described by 


$$v(t) = \begin{cases} 10t^2 - 5t & 0 \le t \le 8 \\ 624 - 3t & 8 \le t \le 16 \\ 36t + 12(t - 16)^2 & 16 \le t \le 26 \\ 2136e^{-0.1(t - 26)} & t > 26 \\ 0 & \text{otherwise} \end{cases}$$


Develop an M-file function to compute v as a function of t. 
Then, develop a script that uses this function to generate a 
plot of v versus t for t = -5 to 50.

3.15 Develop an M-file function called rounder to round a 
number x to a specified number of decimal digits, n. The first 
line of the function should be set up as 


function xr = rounder(x, n) 


Test the program by rounding each of the following to 2 decimal
digits: x = 477.9587, -477.9587, 0.125, 0.135, -0.125,
and -0.135.

3.16 Develop an M-file function to determine the elapsed 
days in a year. The first line of the function should be set 
up as 


function nd = days(mo, da, leap) 


where mo = the month (1-12), da = the day (1-31), and leap =
(0 for non-leap year and 1 for leap year). Test it for January
1, 1997, February 29, 2004, March 1, 2001, June 21, 2004,
and December 31, 2008. Hint: A nice way to do this combines
the for and the switch structures.

3.17 Develop an M-file function to determine the elapsed 
days in a year. The first line of the function should be set 
up as 


function nd = days(mo, da, year) 


where mo = the month (1-12), da = the day (1-31), and 
year = the year. Test it for January 1, 1997, February 29, 
2004, March 1, 2001, June 21, 2004, and December 31, 2008. 
3.18 Develop a function function M-file that returns the 
difference between the passed function’s maximum and 
minimum value given a range of the independent variable. 
In addition, have the function generate a plot of the function 
for the range. Test it for the following cases: 

(a) f(t) = 8e^(-0.25t) sin(t - 2) from t = 0 to 6π.

(b) f(x) = e^(4x) sin(1/x) from x = 0.01 to 0.2.

(c) The built-in humps function from x = 0 to 2. 

3.19 Modify the function function odesimp developed at the
end of Sec. 3.6 so that it can be passed the arguments of the 
passed function. Test it for the following case: 


>> dvdt=@(v,m,cd) 9.81-(cd/m)*v^2;
>> odesimp(dvdt,0.5,0,12,-10,70,0.23) 


3.20 A Cartesian vector can be thought of as representing 
magnitudes along the x-, y-, and z-axes multiplied by a unit 
vector (i, j, k). For such cases, the dot product of two of these 
vectors {a} and {b} corresponds to the product of their mag- 
nitudes and the cosine of the angle between their tails as in 


{a}·{b} = ab cos θ


The cross product yields another vector, {c} = {a} x {b}, 
which is perpendicular to the plane defined by {a} and {b} 
such that its direction is specified by the right-hand rule. 
Develop an M-file function that is passed two such vectors 
and returns θ, {c}, and the magnitude of {c}, and generates
a three-dimensional plot of the three vectors {a}, {b}, and
{c} with their origins at zero. Use dashed lines for {a} and
{b} and a solid line for {c}. Test your function for the fol- 
lowing cases: 


(a) a = [6 4 2]; b= [2 6 4]; 

(b) a = [3 2 -6]; b = [4 -3 1]; 
(c) a= [2 -2 1]; b = [4 2 -4]; 
(d) a = [-1 0 0]; b = [0 -1 0]; 


3.21 Based on Example 3.7, develop a script to produce an 
animation of a bouncing ball where v_0 = 5 m/s and θ_0 = 50°.
To do this, you must be able to predict exactly when the ball 
hits the ground. At this point, the direction changes (the new 
angle will equal the negative of the angle at impact), and the 
velocity will decrease in magnitude to reflect energy loss
due to the collision of the ball with the ground. The change
in velocity can be quantified by the coefficient of restitution
C_R, which is equal to the ratio of the velocity after to the
velocity before impact. For the present case, use a value of
C_R = 0.8.

3.22 Develop a function to produce an animation of a par- 
ticle moving in a circle in Cartesian coordinates based on 
radial coordinates. Assume a constant radius, r, and allow 
the angle, θ, to increase from zero to 2π in equal increments.
The function’s first lines should be 


function phasor(r, nt, nm) 

% function to show the orbit of a phasor 
% r = radius 

% nt = number of increments for theta 

% nm = number of movies 


Test your function with 
phasor(1, 256, 10) 


3.23 Develop a script to produce a movie for the butterfly 
plot from Prob. 2.22. Use a particle located at the x-y coordi- 
nates to visualize how the plot evolves in time. 

3.24 Develop a MATLAB script to compute the velocity, v, 
and position, z, of a hot air balloon as described in Prob. 1.28. 
Perform the calculation from t = 0 to 60 s with a step size of 
1.6 s. At z= 200 m, assume that part of the payload (100 kg) 
is dropped out of the balloon. Your script should be struc- 
tured like: 


% YourFullName
% Hot Air Balloon Script

clear,clc,clf

global g
g=9.81;

% set parameters
r=8.65; % balloon radius
CD=0.47; % dimensionless drag coefficient
mP=265; % mass of payload
P=101300; % atmospheric pressure
Rgas=287; % gas constant for dry air
TC=100; % air temperature
rhoa=1.2; % air density
zd=200; % elevation at which mass is jettisoned
md=100; % mass jettisoned
ti=0; % initial time (s)
tf=60; % final time (s)
vi=0; % initial velocity
zi=0; % initial elevation
dt=1.6; % integration time step

% precomputations
d = 2 * r; Ta = TC + 273.15; Ab = pi / 4 * d ^ 2;
Vb = pi / 6 * d ^ 3; rhog = P / Rgas / Ta; mG = Vb * rhog;
FB = Vb * rhoa * g; FG = mG * g; cdp = rhoa * Ab * CD / 2;

% compute times, velocities and elevations
[t,y] = Balloon(FB, FG, mG, cdp, mP, md, zd, ...
  ti,vi,zi,tf,dt);

% Display results
% Your code to display a nice labeled table of times,
% velocities, and elevations

% Plot results
% Your code to create a nice labeled plot of velocity
% and elevation versus time.


Your function should be structured like: 


function [tout,yout]=Balloon(FB, FG, mG, cdp, mP, md, ...
  zd, ti,vi,zi,tf,dt)
global g
% balloon
% function [tout,yout]=Balloon(FB, FG, mG, cdp, mP,
%   md, zd, ti,vi,zi,tf,dt)
% Function to generate solutions of vertical
% velocity and elevation
% versus time with Euler's method for a hot air
% balloon
% Input:
% FB = buoyancy force (N)
% FG = gravity force (N)
% mG = mass (kg)
% cdp = dimensional drag coefficient
% mP = mass of payload (kg)
% md = mass jettisoned (kg)
% zd = elevation at which mass is jettisoned (m)
% ti = initial time (s)
% vi = initial velocity (m/s)
% zi = initial elevation (m)
% tf = final time (s)
% dt = integration time step (s)
% Output:
% tout = vector of times (s)
% yout(:,1) = velocities (m/s)
% yout(:,2) = elevations (m)

% Code to implement Euler’s method to compute output
% and plot results




3.25 A general equation for a sinusoid can be written as 


$$y(t) = \bar{y} + \Delta y \sin(2\pi f t - \phi)$$


where y = the dependent variable, ȳ = the mean value, Δy =
the amplitude, f = the ordinary frequency (i.e., the number of
oscillations that occur each unit of time), t = the independent
variable (in this case time), and φ = phase shift. Develop
a MATLAB script to generate a 5 panel vertical plot to illustrate
how the function changes as the parameters change.
On each plot display the simple sine wave, y(t) = sin(2πt), as
a red line. Then, add the following functions to each of the
5 panels as black lines:


subplot Function Title 

5,1,1 yt)=1+4 sin(2zt) (a) Effect of mean 

5,1,2 = y(t) = 2sin(2zt) (b) Effect of amplitude 

5,1,3 y(t)=sin(4at) (c) Effect of frequency 

5,1,4 = y(t)=sin(2at— 2/4) (d) Effect of phase shift 

5,1,5 = y(t) = cos(2at—-2/2) (e) Relationship of sine to cosine 


Employ a range of t = 0 to 27 and scale each subplot so that 
the abscissa goes from 0 to 2z and the ordinate goes from —2 
to 2. Include the titles with each subplot, label each subplot’s 
ordinate as 'f(t)', and the bottom plot’s abscissa as 't'. 
3.26 A fractal is a curve or geometric figure, each part of 
which has the same statistical character as the whole. Fractals 
are useful in modeling structures (such as eroded coastlines or 
snowflakes) in which similar patterns recur at progressively 
smaller scales, and in describing partly random or chaotic 
phenomena such as crystal growth, fluid turbulence, and gal- 
axy formation. Devaney (1990) has written a nice little book 
that includes a simple algorithm to create an interesting fractal 
pattern. Here is a step-by-step description of this algorithm: 


Step 1: Assign values to m and n and set hold on.

Step 2: Start a for loop to iterate over ii = 1:100000

Step 3: Compute a random number, q = 3*rand(1)

Step 4: If the value of q is less than 1 go to Step 5. Otherwise
go to Step 6.

Step 5: Compute new values for m = m/2 and n = n/2 and
then go to Step 9.

Step 6: If the value of q is less than 2 go to Step 7. Otherwise
go to Step 8.

Step 7: Compute new values for m = m/2 and n = (300 +
n)/2, and then go to Step 9.

Step 8: Compute new values for m = (300 + m)/2 and n =
(300 + n)/2.

Step 9: If ii is less than 1000 then go to Step 10. Otherwise,
go to Step 11.

Step 10: Plot a point at the coordinate (m,n).

Step 11: Terminate ii loop.

Step 12: Set hold off.


Develop a MATLAB script for this algorithm using for and
if structures. Run it for the following two cases (a) m = 2
and n = 1 and (b) m = 100 and n = 200.

3.27 Write a well-structured MATLAB function procedure
named Fnorm to calculate the Frobenius norm of an m x n
matrix,

$$\|A\|_f = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{i,j}^2}$$

Here is a script that uses the function

A = [5 7 9; 1 8 4; 7 6 2];
Fn = Fnorm(A)

Here is the first line of the function

function Norm = Fnorm(x)

Develop two versions of the function: (a) using nested for
loops and (b) using sum functions.

3.28 The pressure and temperature of the atmosphere are
constantly changing depending on a number of factors including
altitude, latitude/longitude, time of day, and season.
To take all these variations into account when considering
the design and performance of flight vehicles is impractical.
Therefore, a standard atmosphere is frequently used to
provide engineers and scientists with a common reference
for their research and development. The International Standard
Atmosphere is one such model of how conditions of the
earth's atmosphere change over a wide range of altitudes or
elevations. The following table shows values of temperature
and pressure at selected altitudes.

Layer      Layer          Base Geopotential Altitude   Lapse Rate   Base Temperature   Base Pressure,
Index, i   Name           Above MSL, h (km)            (°C/km)      T (°C)             p (Pa)
1          Troposphere    0                            -6.5         15                 101,325
2          Tropopause     11                           0            -56.5              22,632
3          Stratosphere   20                           1            -56.5              5474.9
4          Stratosphere   32                           2.8          -44.5              868.02
5          Stratopause    47                           0            -2.5               110.91
6          Mesosphere     51                           -2.8         -2.5               66.939
7          Mesosphere     71                           -2.0         -58.5              3.9564
8          Mesopause      84.852                       n/a          -86.28             0.3734
The temperature at each altitude can then be computed as

$$T(h) = T_i + \gamma_i (h - h_i) \qquad h_i < h \le h_{i+1}$$

where $T(h)$ = temperature at altitude $h$ (°C), $T_i$ = the base
temperature for layer $i$ (°C), $\gamma_i$ = lapse rate or the rate
at which atmospheric temperature decreases linearly with
increase in altitude for layer $i$ (°C/km), and $h_i$ = base geopotential
altitude above mean sea level (MSL) for layer $i$.
The pressure at each altitude can then be computed as

$$p(h) = p_i + \frac{p_{i+1} - p_i}{h_{i+1} - h_i} (h - h_i)$$

where $p(h)$ = pressure at altitude $h$ (Pa = N/m^2), $p_i$ = the
base pressure for layer $i$ (Pa). The density, $\rho$ (kg/m^3), can
then be calculated according to a molar form of the ideal
gas law:

$$\rho = \frac{pM}{RT_a}$$

where $M$ = molar mass (= 0.0289644 kg/mol), $R$ = the universal
gas constant (8.3144621 J/(mol · K)), and $T_a$ = absolute
temperature (K) = $T$ + 273.15.

Develop a MATLAB function, StdAtm, to deter- 
mine values of the three properties for a given altitude. 
If the user requests a value outside the range of altitudes, 
have the function display an error message and terminate 
the application. Use the following script as the starting point 
to create a 3-panel plot of altitude versus the properties. 


% Script to generate a plot of temperature, pressure and density
% for the International Standard Atmosphere
clc,clf
h=[0 11 20 32 47 51 71 84.852];
gamma=[-6.5 0 1 2.8 0 -2.8 -2];
T=[15 -56.5 -56.5 -44.5 -2.5 -2.5 -58.5 -86.28];
p=[101325 22632 5474.9 868.02 110.91 66.939 3.9564 0.3734];
hint=[0:0.1:84.852];
for i=1:length(hint)
  [Tint(i),pint(i),rint(i)]=StdAtm(h,T,p,gamma,hint(i));
end

% Create plot

% Function call to test error trap
[Tint(i),pint(i),rint(i)]=StdAtm(h,T,p,gamma,85);


3.29 Develop a MATLAB function to convert a vector
of temperatures from Celsius to Fahrenheit and vice
versa. Test it with the following data for the average
monthly temperatures at Death Valley, CA and at the
South Pole.


Day    Death Valley (°F)    South Pole (°C)
15     54                   -27
45     60                   -40
75     69                   -53
105    77                   -56
135    87                   -57
165    96                   -57
195    102                  -59
225    101                  -59
255    92                   -59
285    78                   -50
315    63                   -38
345    52                   -27


Use the following script as the starting point to gener- 
ate a 2-panel stacked plot of the temperatures versus day for 
both of the sites with the Celsius time series at the top and 
the Fahrenheit at the bottom. If the user requests a unit other 
than 'C' or 'F', have the function display an error message 
and terminate the application. 


% Script to generate stacked plots of temperatures versus time
% for Death Valley and the South Pole with Celsius time series
% on the top plot and Fahrenheit on the bottom.

clc,clf
t=[15 45 75 105 135 165 195 225 255 285 315 345];
TFDV=[54 60 69 77 87 96 102 101 92 78 63 52];
TCSP=[-27 -40 -53 -56 -57 -57 -59 -59 -59 -50 -38 -27];
TCDV=TempConv(TFDV,'C');
TFSP=TempConv(TCSP,'F');

% Create plot

% Test of error trap
TKSP=TempConv(TCSP,'K');


3.30 Because there are only two possibilities, as in 
Prob. 3.29, it’s relatively easy to convert between Cel- 
sius and Fahrenheit temperature units. Because there are 
many more units in common use, converting between 
pressure units is more challenging. Here are some of the 
possibilities along with the number of Pascals represented 
by each: 
Index, i   Unit, U_i   Description or Usage                                        # of Pa, C_i
1          psi         Tire pressure, above ambient pressure                       6894.76
2          atm         Used for high pressure experiments                          101,325
3          inHg        Atmospheric pressure given by weatherperson                 3376.85
4          kg/cm^2     Europe metric unit in situations where psi used in U.S.     98,066.5
5          inH2O       Used in heating/ventilating systems in buildings            248.843
6          Pa          Standard SI (metric) unit of pressure, 1 N/m^2              1
7          bar         Frequently used by meteorologists                           100,000
8          dyne/cm^2   Older scientific pressure unit from the CGS system          0.1
9          ftH2O       American and English low value pressure unit                2988.98
10         mmHg        Used in laboratory pressure measurements                    133.322
11         torr        Same as 1 mmHg but used in vacuum measurement               133.322
12         ksi         Used in structural engineering                              6,894,760


The information in the table can be used to implement a 
conversion calculation. One way to do this is to store both 
the units and the corresponding number of the Pascals in 
individual arrays with the subscripts corresponding to each 
entry. For example, 


U(1)='psi' C(1)=6894.76 
U(2)='atm' C(2)=101325. 


The conversion from one unit to another could then be computed
with the following general formula:

$$P_d = \frac{C_i}{C_j} P_g$$

where $P_g$ = given pressure, $P_d$ = desired pressure, $j$ = the
index of the desired unit, and $i$ = the index of the given unit.
Because each $C$ is the number of Pascals in one unit, multiplying
by $C_i$ converts the given pressure to Pascals and dividing by
$C_j$ converts Pascals to the desired unit.
As an example, to convert a tire pressure of say 28.6 psi to
atm, we would use

$$P_d = \frac{C_1}{C_2} P_g = \frac{6894.76 \text{ Pa/psi}}{101{,}325 \text{ Pa/atm}} \times 28.6 \text{ psi} = 1.94612 \text{ atm}$$

So we see that the conversion from one unit to another involves
first determining the indices corresponding to the
given and the desired units, and then implementing the
conversion equation. Here is a step-by-step algorithm to
do this:


1. Assign the values of the unit, U, and conversion, C, arrays.

2. Have the user select the input units by entering the
value of i.

If the user enters a correct value within the range, 1-12,
continue to step 3.

If the user enters a value outside the range, display an
error message and repeat step 2.

3. Have the user enter the given value of pressure, $P_g$.

4. Have the user select the desired units by entering the
value of j.

If the user enters a correct value within the range, 1-12,
continue to step 5.

If the user enters a value outside the range, display an
error message and repeat step 4.

5. Use the formula to convert the quantity in input units to
its value in the desired output units.

6. Display the original quantity and units and the output
quantity and units.

7. Ask if another output result, for the same input, is
desired.

If yes, go back to step 4 and continue from there.
If no, go on to step 8.

8. Ask if another conversion is desired.

If yes, go back to step 2 and continue from there.
If no, end of algorithm.


Develop a well-structured MATLAB script using loop and
if structures to implement this algorithm. Test it with the
following:


(a) Duplicate the hand calculation example to ensure that
you get about 1.94612 atm for an input of 28.6 psi.

(b) Try to enter a choice code of i = 13. Does the program 
trap that error and allow you to correct it? If not, it 
should. Now, try a choice code of the letter Q. What 
happens? 

(c) For the yes/no questions in the program, try a letter other 
than y or n. Does the program scold you and allow you 
to correct your response? What happens if you enter a 
number like 1.3 as a response? What happens if you just 
press the Enter key with no response? 


Roundoff and Truncation Errors 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to acquaint you with the major sources of 
errors involved in numerical methods. Specific objectives and topics covered are 


Understanding the distinction between accuracy and precision. 

Learning how to quantify error. 

Learning how error estimates can be used to decide when to terminate an iterative 
calculation. 

Understanding how roundoff errors occur because digital computers have a 
limited ability to represent numbers. 

Understanding why floating-point numbers have limits on their range and 
precision. 

Recognizing that truncation errors occur when exact mathematical formulations 
are represented by approximations. 

Knowing how to use the Taylor series to estimate truncation errors. 
Understanding how to write forward, backward, and centered finite-difference 
approximations of first and second derivatives. 

Recognizing that efforts to minimize truncation errors can sometimes increase 
roundoff errors. 


YOU’VE GOT A PROBLEM 


In Chap. 1, you developed a numerical model for the velocity of a bungee jumper. To
solve the problem with a computer, you had to approximate the derivative of velocity
with a finite difference:

$$\frac{dv}{dt} \cong \frac{\Delta v}{\Delta t} = \frac{v(t_{i+1}) - v(t_i)}{t_{i+1} - t_i}$$

Thus, the resulting solution is not exact—that is, it has error.

In addition, the computer you use to obtain the solution is also an imperfect tool. Be- 
cause it is a digital device, the computer is limited in its ability to represent the magnitudes 
and precision of numbers. Consequently, the machine itself yields results that contain error. 

So both your mathematical approximation and your digital computer cause your re- 
sulting model prediction to be uncertain. Your problem is: How do you deal with such 
uncertainty? In particular, is it possible to understand, quantify, and control such errors 
in order to obtain acceptable results? This chapter introduces you to some approaches and 
concepts that engineers and scientists use to deal with this dilemma. 


4.1 ERRORS


Engineers and scientists constantly find themselves having to accomplish objectives based 
on uncertain information. Although perfection is a laudable goal, it is rarely if ever at- 
tained. For example, despite the fact that the model developed from Newton’s second law 
is an excellent approximation, it would never in practice exactly predict the jumper’s fall. 
A variety of factors such as winds and slight variations in air resistance would result in 
deviations from the prediction. If these deviations are systematically high or low, then we 
might need to develop a new model. However, if they are randomly distributed and tightly 
grouped around the prediction, then the deviations might be considered negligible and the 
model deemed adequate. Numerical approximations also introduce similar discrepancies 
into the analysis. 

This chapter covers basic topics related to the identification, quantification, and mini- 
mization of these errors. General information concerned with the quantification of error 
is reviewed in this section. This is followed by Sections 4.2 and 4.3, dealing with the 
two major forms of numerical error: roundoff error (due to computer approximations) and 
truncation error (due to mathematical approximations). We also describe how strategies to 
reduce truncation error sometimes increase roundoff. Finally, we briefly discuss errors not 
directly connected with the numerical methods themselves. These include blunders, model 
errors, and data uncertainty. 


4.1.1 Accuracy and Precision 


The errors associated with both calculations and measurements can be characterized with 
regard to their accuracy and precision. Accuracy refers to how closely a computed or mea- 
sured value agrees with the true value. Precision refers to how closely individual computed 
or measured values agree with each other. 

These concepts can be illustrated graphically using an analogy from target practice. 
The bullet holes on each target in Fig. 4.1 can be thought of as the predictions of a numeri- 
cal technique, whereas the bull’s-eye represents the truth. Inaccuracy (also called bias) is 
defined as systematic deviation from the truth. Thus, although the shots in Fig. 4.1c are 
more tightly grouped than in Fig. 4.1a, the two cases are equally biased because they are 
both centered on the upper left quadrant of the target. Imprecision (also called uncertainty), 
on the other hand, refers to the magnitude of the scatter. Therefore, although Fig. 4.1b and 
d are equally accurate (i.e., centered on the bull’s-eye), the latter is more precise because 
the shots are tightly grouped. 
FIGURE 4.1
An example from marksmanship illustrating the concepts of accuracy and precision:
(a) inaccurate and imprecise, (b) accurate and imprecise, (c) inaccurate and precise,
and (d) accurate and precise. [Four target panels; one axis is labeled "Increasing accuracy"
and the other "Increasing precision."]


Numerical methods should be sufficiently accurate or unbiased to meet the require- 
ments of a particular problem. They also should be precise enough for adequate design. 
In this book, we will use the collective term error to represent both the inaccuracy and 
imprecision of our predictions. 
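
As a rough numerical counterpart to the target analogy, the following sketch treats inaccuracy as the offset of the average prediction from the truth and imprecision as the scatter about that average. The data values and the variable names (truth, preds, bias, scatter) are hypothetical and chosen only for illustration; they do not come from the text.

```matlab
% Hypothetical predictions of a quantity whose true value is 10.
truth = 10;
preds = [10.4 10.6 10.5 10.5];   % tightly grouped but off target

bias    = mean(preds) - truth;   % inaccuracy: systematic deviation from truth
scatter = std(preds);            % imprecision: magnitude of the scatter

fprintf('bias = %g, scatter = %g\n', bias, scatter);
```

With these made-up numbers the scatter is small relative to the bias, which corresponds to the "inaccurate and precise" case, panel (c) of Fig. 4.1.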


4.1.2 Error Definitions 


Numerical errors arise from the use of approximations to represent exact mathematical 
operations and quantities. For such errors, the relationship between the exact, or true, result 
and the approximation can be formulated as 


True value = approximation + error (4.1) 


By rearranging Eq. (4.1), we find that the numerical error is equal to the discrepancy 
between the truth and the approximation, as in 


$$E_t = \text{true value} - \text{approximation} \qquad (4.2)$$


where $E_t$ is used to designate the exact value of the error. The subscript $t$ is included to
designate that this is the "true" error. This is in contrast to other cases, as described shortly,
where an "approximate" estimate of the error must be employed. Note that the true error is
commonly expressed as an absolute value and referred to as the absolute error.

A shortcoming of this definition is that it takes no account of the order of magnitude of 
the value under examination. For example, an error of a centimeter is much more significant 
if we are measuring a rivet than a bridge. One way to account for the magnitudes of the
quantities being evaluated is to normalize the error to the true value, as in

$$\text{True fractional relative error} = \frac{\text{true value} - \text{approximation}}{\text{true value}}$$

The relative error can also be multiplied by 100% to express it as

$$\varepsilon_t = \frac{\text{true value} - \text{approximation}}{\text{true value}} \, 100\% \qquad (4.3)$$

where $\varepsilon_t$ designates the true percent relative error.

For example, suppose that you have the task of measuring the lengths of a bridge and 
a rivet and come up with 9999 and 9 cm, respectively. If the true values are 10,000 and 
10 cm, respectively, the error in both cases is 1 cm. However, their percent relative errors 
can be computed using Eq. (4.3) as 0.01% and 10%, respectively. Thus, although both mea- 
surements have an absolute error of 1 cm, the relative error for the rivet is much greater. 
We would probably conclude that we have done an adequate job of measuring the bridge, 
whereas our estimate for the rivet leaves something to be desired. 
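
The bridge and rivet comparison can be checked with a few lines of MATLAB. This is only a sketch of Eqs. (4.2) and (4.3) applied to the numbers quoted above; the variable names are our own.

```matlab
% True error, Eq. (4.2), and true percent relative error, Eq. (4.3),
% for the bridge and rivet measurements quoted in the text.
trueval = [10000 10];   % true lengths (cm): bridge, rivet
approx  = [9999 9];     % measured lengths (cm): bridge, rivet

Et = trueval - approx;         % true error (cm)
et = Et ./ trueval * 100;      % true percent relative error (%)

fprintf('bridge: Et = %g cm, et = %g%%\n', Et(1), et(1));
fprintf('rivet:  Et = %g cm, et = %g%%\n', Et(2), et(2));
```

Both entries of Et equal 1 cm, whereas et is 0.01% for the bridge and 10% for the rivet, in agreement with the values stated above.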

Notice that for Eqs. (4.2) and (4.3), $E$ and $\varepsilon$ are subscripted with a $t$ to signify that the
error is based on the true value. For the example of the rivet and the bridge, we were pro- 
vided with this value. However, in actual situations such information is rarely available. For 
numerical methods, the true value will only be known when we deal with functions that can 
be solved analytically. Such will typically be the case when we investigate the theoretical 
behavior of a particular technique for simple systems. However, in real-world applications, 
we will obviously not know the true answer a priori. For these situations, an alternative 
is to normalize the error using the best available estimate of the true value—that is, to the 
approximation itself, as in 


$$\varepsilon_a = \frac{\text{approximate error}}{\text{approximation}} \, 100\% \qquad (4.4)$$


where the subscript a signifies that the error is normalized to an approximate value. Note 
also that for real-world applications, Eq. (4.2) cannot be used to calculate the error term 
in the numerator of Eq. (4.4). One of the challenges of numerical methods is to determine 
error estimates in the absence of knowledge regarding the true value. For example, certain 
numerical methods use iteration to compute answers. In such cases, a present approxima- 
tion is made on the basis of a previous approximation. This process is performed repeat- 
edly, or iteratively, to successively compute (hopefully) better and better approximations. 
For such cases, the error is often estimated as the difference between the previous and 
present approximations. Thus, percent relative error is determined according to 


$$\varepsilon_a = \frac{\text{present approximation} - \text{previous approximation}}{\text{present approximation}} \, 100\% \qquad (4.5)$$


These and other approaches for expressing errors are elaborated on in subsequent chapters.
The signs of Eqs. (4.2) through (4.5) may be either positive or negative. If the approxi- 
mation is greater than the true value (or the previous approximation is greater than the current 
approximation), the error is negative; if the approximation is less than the true value, the error 
is positive. Also, for Eqs. (4.3) to (4.5), the denominator may be less than zero, which can 
also lead to a negative error. Often, when performing computations, we may not be concerned
with the sign of the error but are interested in whether the absolute value of the percent relative
error is lower than a prespecified tolerance $\varepsilon_s$. Therefore, it is often useful to employ the
absolute value of Eq. (4.5). For such cases, the computation is repeated until


$$|\varepsilon_a| < \varepsilon_s \qquad (4.6)$$


This relationship is referred to as a stopping criterion. If it is satisfied, our result is assumed
to be within the prespecified acceptable level $\varepsilon_s$. Note that for the remainder of this text, we
almost always employ absolute values when using relative errors.

It is also convenient to relate these errors to the number of significant figures in the
approximation. It can be shown (Scarborough, 1966) that if the following criterion is met,
we can be assured that the result is correct to at least n significant figures.


$$\varepsilon_s = (0.5 \times 10^{2-n})\% \qquad (4.7)$$
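
As a quick illustration of Eq. (4.7), the tolerance for a desired number of significant figures can be computed directly. The variable names n and es are our own choices.

```matlab
% Stopping tolerance from Eq. (4.7) for n significant figures.
n  = 3;                    % desired number of significant figures
es = 0.5 * 10^(2 - n);     % tolerance expressed in percent

fprintf('es = %g%%\n', es);
```

For n = 3 this gives a tolerance of 0.05%, so an iteration would be repeated until the absolute value of the approximate percent relative error drops below that level.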


EXAMPLE 4.1 Error Estimates for Iterative Methods


Problem Statement. In mathematics, functions can often be represented by infinite series.
For example, the exponential function can be computed using

$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} \qquad (E4.1.1)$$

Thus, as more terms are added in sequence, the approximation becomes a better and better
estimate of the true value of $e^x$. Equation (E4.1.1) is called a Maclaurin series expansion.
Starting with the simplest version, $e^x = 1$, add terms one at a time in order to estimate
$e^{0.5}$. After each new term is added, compute the true and approximate percent relative errors
with Eqs. (4.3) and (4.5), respectively. Note that the true value is $e^{0.5} = 1.648721\ldots$. Add
terms until the absolute value of the approximate error estimate $\varepsilon_a$ falls below a prespecified
error criterion $\varepsilon_s$ conforming to three significant figures.


Solution. First, Eq. (4.7) can be employed to determine the error criterion that ensures a 
result that is correct to at least three significant figures: 


$$\varepsilon_s = (0.5 \times 10^{2-3})\% = 0.05\%$$


Thus, we will add terms to the series until $\varepsilon_a$ falls below this level.
The first estimate is simply equal to Eq. (E4.1.1) with a single term. Thus, the first
estimate is equal to 1. The second estimate is then generated by adding the second term as in

$$e^x = 1 + x$$

or for $x = 0.5$

$$e^{0.5} = 1 + 0.5 = 1.5$$


This represents a true percent relative error of [Eq. (4.3)]


$$\varepsilon_t = \left| \frac{1.648721 - 1.5}{1.648721} \right| \times 100\% = 9.02\%$$

Equation (4.5) can be used to determine an approximate estimate of the error, as in


$$\varepsilon_a = \left| \frac{1.5 - 1}{1.5} \right| \times 100\% = 33.3\%$$


Because $\varepsilon_a$ is not less than the required value of $\varepsilon_s$, we would continue the computation by
adding another term, $x^2/2!$, and repeating the error calculations. The process is continued
until $|\varepsilon_a| < \varepsilon_s$. The entire computation can be summarized as


Terms   Result         $\varepsilon_t$, %   $\varepsilon_a$, %
1       1              39.3
2       1.5            9.02       33.3
3       1.625          1.44       7.69
4       1.645833333    0.175      1.27
5       1.648437500    0.0172     0.158
6       1.648697917    0.00142    0.0158


Thus, after six terms are included, the approximate error falls below $\varepsilon_s = 0.05\%$, and the
computation is terminated. However, notice that, rather than three significant figures, the
result is accurate to five! This is because, for this case, both Eqs. (4.5) and (4.7) are conservative.
That is, they ensure that the result is at least as good as they specify. Although this
is not always the case for Eq. (4.5), it is true most of the time.
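
The table above can be regenerated with a short script. The following is a minimal sketch rather than the book's own listing: it adds one Maclaurin term per pass and stops once the approximate error falls below the three-significant-figure tolerance. The variable names (nterms, sol, solold) are assumptions made for this illustration.

```matlab
% Reproduce the Example 4.1 summary table for x = 0.5.
x = 0.5;
trueval = exp(x);               % true value, used only to report et
es = 0.5 * 10^(2 - 3);          % Eq. (4.7) with n = 3, i.e. 0.05%

sol = 1; nterms = 1; ea = 100;  % one-term estimate; ea forces a first pass
fprintf('%5s %14s %10s %10s\n', 'Terms', 'Result', 'et (%)', 'ea (%)');
fprintf('%5d %14.9f %10.3g\n', nterms, sol, abs((trueval - sol)/trueval)*100);

while ea >= es
    solold = sol;
    sol = sol + x^nterms / factorial(nterms);   % add the next series term
    nterms = nterms + 1;
    et = abs((trueval - sol) / trueval) * 100;  % Eq. (4.3)
    ea = abs((sol - solold) / sol) * 100;       % Eq. (4.5)
    fprintf('%5d %14.9f %10.3g %10.3g\n', nterms, sol, et, ea);
end
```

The loop exits with nterms equal to 6, matching the last row of the table.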


4.1.3 Computer Algorithm for Iterative Calculations 


Many of the numerical methods described in the remainder of this text involve iterative 
calculations of the sort illustrated in Example 4.1. These all entail solving a mathematical 
problem by computing successive approximations to the solution starting from an initial 
guess. 

The computer implementation of such iterative solutions involves loops. As we saw in 
Sec. 3.3.2, these come in two basic flavors: count-controlled and decision loops. Most it- 
erative solutions use decision loops. Thus, rather than employing a prespecified number of 
iterations, the process typically is repeated until an approximate error estimate falls below 
a stopping criterion as in Example 4.1. 

To do this for the same problem as Example 4.1, the series expansion can be
expressed as

$$e^x \cong \sum_{i=0}^{n} \frac{x^i}{i!}$$

An M-file to implement this formula is shown in Fig. 4.2. The function is passed the value 
to be evaluated (x) along with a stopping error criterion (es) and a maximum allowable 
number of iterations (maxit). If the user omits either of the latter two parameters, the func- 
tion assigns default values. 
function [fx,ea,iter] = IterMeth(x,es,maxit)
% Maclaurin series of exponential function
%   [fx,ea,iter] = IterMeth(x,es,maxit)
% input:
%   x = value at which series evaluated
%   es = stopping criterion (default = 0.0001)
%   maxit = maximum iterations (default = 50)
% output:
%   fx = estimated value
%   ea = approximate relative error (%)
%   iter = number of iterations

% defaults:
if nargin<2||isempty(es),es=0.0001;end
if nargin<3||isempty(maxit),maxit=50;end
% initialization
iter = 1; sol = 1; ea = 100;
% iterative calculation
while (1)
  solold = sol;
  sol = sol + x ^ iter / factorial(iter);
  iter = iter + 1;
  if sol~=0
    ea=abs((sol - solold)/sol)*100;
  end
  if ea<=es || iter>=maxit,break,end
end
fx = sol;
end

FIGURE 4.2
An M-file to solve an iterative calculation. This example is set up to evaluate the Maclaurin
series expansion for $e^x$ as described in Example 4.1.


The function then initializes three variables: (a) iter, which keeps track of the number 
of iterations, (b) sol, which holds the current estimate of the solution, and (c) a variable, 
ea, which holds the approximate percent relative error. Note that ea is initially set to a value 
of 100 to ensure that the loop executes at least once. 

These initializations are followed by a decision loop that actually implements the 
iterative calculation. Prior to generating a new solution, the previous value, sol, is first 
assigned to solold. Then a new value of sol is computed and the iteration counter is incre- 
mented. If the new value of sol is nonzero, the percent relative error, ea, is determined. The 
stopping criteria are then tested. If both are false, the loop repeats. If either is true, the loop 
terminates and the final solution is sent back to the function call. 

When the M-file is implemented, it generates an estimate for the exponential func- 
tion which is returned along with the approximate error and the number of iterations. For 
example, e! can be evaluated as 


>> format long
>> [approxval, ea, iter] = IterMeth(1,1e-6,100)
approxval =
   2.718281826198493
ea =
   9.216155641522974e-007
iter =
    12

We can see that after 12 iterations, we obtain a result of 2.7182818 with an approximate
error estimate of 9.2162 × 10^-7 %. The result can be verified by using the built-in
exp function to directly calculate the exact value and the true percent relative error,


>> trueval=exp(1) 


trueval = 
2.718281828459046 


>> et=abs((trueval-approxval)/trueval)*100
et =
   8.316108397236229e-008


As was the case with Example 4.1, we obtain the desirable outcome that the true error is 
less than the approximate error. 


4.2 ROUNDOFF ERRORS


Roundoff errors arise because digital computers cannot represent some quantities ex- 
actly. They are important to engineering and scientific problem solving because they 
can lead to erroneous results. In certain cases, they can actually lead to a calculation 
going unstable and yielding obviously erroneous results. Such calculations are said to 
be ill-conditioned. Worse still, they can lead to subtler discrepancies that are difficult 
to detect. 

There are two major facets of roundoff errors involved in numerical calculations: 


1. Digital computers have magnitude and precision limits on their ability to represent 
numbers. 

2. Certain numerical manipulations are highly sensitive to roundoff errors. This can re- 
sult from both mathematical considerations as well as from the way in which comput- 
ers perform arithmetic operations. 


4.2.1 Computer Number Representation 


Numerical roundoff errors are directly related to the manner in which numbers are stored 
in a computer. The fundamental unit whereby information is represented is called a word. 
This is an entity that consists of a string of binary digits, or bits. Numbers are typically 
stored in one or more words. To understand how this is accomplished, we must first review 
some material related to number systems. 

A number system is merely a convention for representing quantities. Because we 
have 10 fingers and 10 toes, the number system that we are most familiar with is the 
decimal, or base-10, number system. A base is the number used as the reference for 
constructing the system. The base-10 system uses the 10 digits—0, 1, 2, 3, 4, 5, 6, 7, 8,
and 9—to represent numbers. By themselves, these digits are satisfactory for counting
from 0 to 9.

For larger quantities, combinations of these basic digits are used, with the position or 
place value specifying the magnitude. The rightmost digit in a whole number represents a 
number from 0 to 9. The second digit from the right represents a multiple of 10. The third 
digit from the right represents a multiple of 100 and so on. For example, if we have the 
number 8642.9, then we have eight groups of 1000, six groups of 100, four groups of 10, 
two groups of 1, and nine groups of 0.1, or 


$$(8 \times 10^3) + (6 \times 10^2) + (4 \times 10^1) + (2 \times 10^0) + (9 \times 10^{-1}) = 8642.9$$


This type of representation is called positional notation. 

Now, because the decimal system is so familiar, it is not commonly realized that 
there are alternatives. For example, if human beings happened to have eight fingers and 
toes we would undoubtedly have developed an octal, or base-8, representation. In the 
same sense, our friend the computer is like a two-fingered animal who is limited to 
two states—either 0 or 1. This relates to the fact that the primary logic units of digital 
computers are on/off electronic components. Hence, numbers on the computer are rep- 
resented with a binary, or base-2, system. Just as with the decimal system, quantities 
can be represented using positional notation. For example, the binary number 101.1 is 
equivalent to $(1 \times 2^2) + (0 \times 2^1) + (1 \times 2^0) + (1 \times 2^{-1}) = 4 + 0 + 1 + 0.5 = 5.5$ in the
decimal system.
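
Positional notation in base 2 can be evaluated mechanically by weighting each digit with the matching power of 2. The sketch below does this for the binary number 101.1; the variable names are assumptions made for this illustration.

```matlab
% Evaluate the binary number 101.1 with positional notation.
bits_int  = [1 0 1];   % digits left of the binary point, most significant first
bits_frac = [1];       % digits right of the binary point

p_int  = numel(bits_int)-1:-1:0;    % exponents 2, 1, 0
p_frac = -(1:numel(bits_frac));     % exponents -1, -2, ...

value = sum(bits_int .* 2.^p_int) + sum(bits_frac .* 2.^p_frac)
```

The result is 5.5, matching the hand calculation above.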


Integer Representation. Now that we have reviewed how base-10 numbers can be rep- 
resented in binary form, it is simple to conceive of how integers are represented on a com- 
puter. The most straightforward approach, called the signed magnitude method, employs 
the first bit of a word to indicate the sign, with a 0 for positive and a 1 for negative. The
remaining bits are used to store the number. For example, the integer value of 173 is rep- 
resented in binary as 10101101: 


$$(10101101)_2 = 2^7 + 2^5 + 2^3 + 2^2 + 2^0 = 128 + 32 + 8 + 4 + 1 = (173)_{10}$$


Therefore, the binary equivalent of —173 would be stored on a 16-bit computer, as depicted 
in Fig. 4.3. 
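
The conversion can be checked with MATLAB's built-in dec2bin and bin2dec functions. The sketch below also assembles the 16-bit signed magnitude pattern for -173 by hand, as a sign bit followed by a 15-bit magnitude. MATLAB does not store integers this way, so the string named word is purely illustrative.

```matlab
% Binary digits of 173 and the reverse conversion.
b = dec2bin(173)       % character vector of binary digits
d = bin2dec(b)         % back to the decimal value

% Signed magnitude pattern for -173 in a 16-bit word:
% a sign bit of 1 followed by the 15-bit magnitude.
word = ['1' dec2bin(173, 15)]
```

Here b is '10101101', d is 173, and word is '1000000010101101'.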

If such a scheme is employed, there clearly is a limited range of integers that can be
represented. Again assuming a 16-bit word size, if one bit is used for the sign, the 15 remaining
bits can represent binary integers from 0 to 111111111111111. The upper limit can be
converted to a decimal integer, as in $(1 \times 2^{14}) + (1 \times 2^{13}) + \cdots + (1 \times 2^1) + (1 \times 2^0) = 32{,}767$.


FIGURE 4.3 
The binary representation of the decimal integer —173 on a 16-bit computer using the signed 
magnitude method. 
Note that this value can be simply evaluated as $2^{15} - 1$. Thus, a 16-bit computer word can
store decimal integers ranging from −32,767 to 32,767.

In addition, because zero is already defined as 0000000000000000, it is redundant 
to use the number 1000000000000000 to define a “minus zero.” Therefore, it is conventionally
employed to represent an additional negative number: −32,768, and the range is
from −32,768 to 32,767. For an n-bit word, the range would be from $-2^{n-1}$ to $2^{n-1} - 1$. Thus,
32-bit integers would range from −2,147,483,648 to +2,147,483,647.
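
These limits can be compared with MATLAB's own integer types. The sketch below uses the built-in intmin and intmax functions; the variable names are ours, and the last line simply illustrates what MATLAB does when an integer result leaves the range.

```matlab
n  = 16;
lo = -2^(n-1);                    % formula from the text: -32768
hi =  2^(n-1) - 1;                % formula from the text:  32767

lo16 = intmin('int16');           % MATLAB's limits for 16-bit integers
hi16 = intmax('int16');
hi32 = intmax('int32');           % 2147483647

% MATLAB integer arithmetic saturates rather than wrapping around:
sat = int16(32767) + int16(1);    % stays at 32767
```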

Note that, although it provides a nice way to illustrate our point, the signed mag- 
nitude method is not actually used to represent integers for conventional computers. A 
preferred approach called the 2s complement technique directly incorporates the sign into 
the number’s magnitude rather than providing a separate bit to represent plus or minus. 
Regardless, the range of numbers is still the same as for the signed magnitude method 
described above. 

The foregoing serves to illustrate how all digital computers are limited in their capability 
to represent integers. That is, numbers above or below the range cannot be represented. A 
more serious limitation is encountered in the storage and manipulation of fractional quanti- 
ties as described next. 


Floating-Point Representation. Fractional quantities are typically represented in com- 
puters using floating-point format. In this approach, which is very much like scientific 
notation, the number is expressed as 


$$\pm s \times b^e$$


where s = the significand (or mantissa), b = the base of the number system being used, and 
e = the exponent. 

Prior to being expressed in this form, the number is normalized by moving the decimal 
place over so that only one significant digit is to the left of the decimal point. This is done so 
computer memory is not wasted on storing useless nonsignificant zeros. For example, a value 
like 0.005678 could be represented in a wasteful manner as $0.005678 \times 10^0$. However, normalization
would yield $5.678 \times 10^{-3}$, which eliminates the useless zeros.

Before describing the base-2 implementation used on computers, we will first ex- 
plore the fundamental implications of such floating-point representation. In particular, 
what are the ramifications of the fact that in order to be stored in the computer, both 
the mantissa and the exponent must be limited to a finite number of bits? As in the 
next example, a nice way to do this is within the context of our more familiar base-10 
decimal world. 


EXAMPLE 4.2


Implications of Floating-Point Representation


Problem Statement. Suppose that we had a hypothetical base-10 computer with a 5-digit 
word size. Assume that one digit is used for the sign, two for the exponent, and two for the 
mantissa. For simplicity, assume that one of the exponent digits is used for its sign, leaving 
a single digit for its magnitude. 


Solution. A general representation of the number following normalization would be 


$$s_1 d_1.d_2 \times 10^{s_0 d_0}$$

where $s_0$ and $s_1$ = the signs, $d_0$ = the magnitude of the exponent, and $d_1$ and $d_2$ = the magnitude
of the significand digits.

Now, let’s play with this system. First, what is the largest possible positive quantity 
that can be represented? Clearly, it would correspond to both signs being positive and all 
magnitude digits set to the largest possible value in base-10, that is, 9: 


Largest value = $+9.9 \times 10^{+9}$


So the largest possible number would be a little less than 10 billion. Although this 
might seem like a big number, it’s really not that big. For example, this computer 
would be incapable of representing a commonly used constant like Avogadro’s number 
($6.022 \times 10^{23}$).

In the same sense, the smallest possible positive number would be 


Smallest value = $+1.0 \times 10^{-9}$


Again, although this value might seem pretty small, you could not use it to represent a 
quantity like Planck’s constant ($6.626 \times 10^{-34}$ J · s).

Similar negative values could also be developed. The resulting ranges are displayed 
in Fig. 4.4. Large positive and negative numbers that fall outside the range would cause an 
overflow error. In a similar sense, for very small quantities there is a “hole” at zero, and 
very small quantities would usually be converted to zero. 

Recognize that the exponent overwhelmingly determines these range limitations. For 
example, if we increase the mantissa by one digit, the maximum value increases slightly to 
$9.99 \times 10^9$. In contrast, a one-digit increase in the exponent raises the maximum by 90 orders
of magnitude to $9.9 \times 10^{99}$!

When it comes to precision, however, the situation is reversed. Whereas the significand 
plays a minor role in defining the range, it has a profound effect on specifying the precision. 
This is dramatically illustrated for this example where we have limited the significand to 
only 2 digits. As in Fig. 4.5, just as there is a “hole” at zero, there are also “holes” between 
values. 

For example, a simple rational number with a finite number of digits like $2^{-5} = 0.03125$
would have to be stored as $3.1 \times 10^{-2}$ or 0.031. Thus, a roundoff error is introduced. For this
case, it represents a relative error of


$$\frac{0.03125 - 0.031}{0.03125} = 0.008$$


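FIGURE 4.4
The number line showing the possible ranges corresponding to the hypothetical base-10
floating-point scheme described in Example 4.2.

[Number line, left to right: overflow below the minimum, $-9.9 \times 10^{9}$; representable negative
values up to $-1.0 \times 10^{-9}$; underflow (the “hole” at zero) up to the smallest positive value,
$1.0 \times 10^{-9}$; representable positive values up to the maximum, $9.9 \times 10^{9}$; overflow beyond.]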
FIGURE 4.5
A small portion of the number line corresponding to the hypothetical base-10 floating-point
scheme described in Example 4.2. The numbers indicate values that can be represented
exactly. All other quantities falling in the “holes” between these values would exhibit some
roundoff error.

[Number line with tick marks at representable values: spacing of 0.01 below 1 (0.98, 0.99, 1)
and spacing of 0.1 above 1 (1, 1.1, 1.2).]


While we could store a number like 0.03125 exactly by expanding the digits of the
significand, quantities with infinite digits must always be approximated. For example, a
commonly used constant such as $\pi$ (= 3.14159...) would have to be represented as $3.1 \times 10^0$
or 3.1. For this case, the relative error is


$$\frac{3.14159 - 3.1}{3.14159} = 0.0132$$


Although adding significand digits can improve the approximation, such quantities will 
always have some roundoff error when stored in a computer. 

Another more subtle effect of floating-point representation is illustrated by Fig. 4.5. 
Notice how the interval between numbers increases as we move between orders of magnitude.
For numbers with an exponent of −1 (i.e., between 0.1 and 1), the spacing is 0.01.
Once we cross over into the range from 1 to 10, the spacing increases to 0.1. This means 
that the roundoff error of a number will be proportional to its magnitude. In addition, 
it means that the relative error will have an upper bound. For this example, the maxi- 
mum relative error would be 0.05. This value is called the machine epsilon (or machine 
precision). 
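
The hypothetical machine of this example is easy to mimic in MATLAB. The sketch below is only an illustration under one assumption of ours: the helper store2 (a name we made up) chops a value to a 2-digit significand, which reproduces the stored values quoted above.

```matlab
% Keep only the first two significant decimal digits of x (chopping).
% This is a hypothetical helper, not a MATLAB built-in.
store2 = @(x) fix(x ./ 10.^(floor(log10(abs(x))) - 1)) ...
              .* 10.^(floor(log10(abs(x))) - 1);

% Relative roundoff error caused by storing x on the 2-digit machine.
relErr = @(x) abs(x - store2(x)) ./ abs(x);

e1 = relErr(0.03125);   % stored as 0.031, relative error of about 0.008
e2 = relErr(pi);        % stored as 3.1,   relative error of about 0.0132
fprintf('%g  %g\n', e1, e2);
```

Trying other values should show the same pattern as Fig. 4.5: the absolute error grows with the magnitude of the number while the relative error stays bounded.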


As illustrated in Example 4.2, the fact that both the exponent and significand are finite 
means that there are both range and precision limits on floating-point representation. Now, 
let us examine how floating-point quantities are actually represented in a real computer 
using base-2 or binary numbers. 

First, let’s look at normalization. Since binary numbers consist exclusively of 0s and
1s, a bonus occurs when they are normalized. That is, the bit to the left of the binary point 
will always be one! This means that this leading bit does not have to be stored. Hence, 
nonzero binary floating-point numbers can be expressed as 


$$\pm(1 + f) \times 2^e$$


where f = the mantissa (i.e., the fractional part of the significand). For example, if we normalized
the binary number 1101.1, the result would be $1.1011 \times 2^3$ or $(1 + 0.1011) \times 2^3$.
FIGURE 4.6
The manner in which a floating-point number is stored in an 8-byte word in IEEE double-precision
format.

[Bit layout, left to right: sign (1 bit), signed exponent (11 bits), mantissa (52 bits).]


Thus, although the original number has five significant bits, we only have to store the four 
fractional bits: 0.1011. 
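
The same normalization can be explored with MATLAB's two-output form of log2, which splits a number into a fraction and a power of 2. Note that log2 uses a slightly different convention (a fraction between 0.5 and 1), so the sketch below shifts its output to match the form used here. The variable names are our own.

```matlab
x = bin2dec('11011') / 2;     % 1101.1 in binary is 27/2 = 13.5
[f, e] = log2(x);             % MATLAB convention: x = f * 2^e, 0.5 <= f < 1

significand = 2 * f;          % shift to the 1 <= s < 2 convention of the text
exponent    = e - 1;          % 3
mantissa    = significand - 1;  % 0.6875, which is 0.1011 in binary
```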

By default, MATLAB has adopted the IEEE double-precision format in which eight
bytes (64 bits) are used to represent floating-point numbers. As in Fig. 4.6, one bit is re- 
served for the number’s sign. In a similar spirit to the way in which integers are stored, the 
exponent and its sign are stored in 11 bits. Finally, 52 bits are set aside for the mantissa. 
However, because of normalization, 53 bits can be stored. 

Now, just as in Example 4.2, this means that the numbers will have a limited range and 
precision. However, because the IEEE format uses many more bits, the resulting number 
system can be used for practical purposes. 


Range. In a fashion similar to the way in which integers are stored, the 11 bits used for
the exponent translate into a range from −1022 to 1023. The largest positive number can
be represented in binary as


Largest value = $+1.1111\ldots1111 \times 2^{+1023}$


where the 52 bits in the mantissa are all 1. Since the significand is approximately 2 (it is
actually $2 - 2^{-52}$), the largest value is therefore approximately $2^{1024} = 1.7977 \times 10^{308}$. In a similar
fashion, the smallest positive number can be represented as


Smallest value = $+1.0000\ldots0000 \times 2^{-1022}$


This value can be translated into a base-10 value of $2^{-1022} = 2.2251 \times 10^{-308}$.
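
Both limits can be rebuilt directly from these bit patterns and compared with MATLAB's realmax and realmin functions. The following is a small sketch with variable names of our own choosing.

```matlab
largest  = (2 - 2^-52) * 2^1023;   % all 52 mantissa bits set, exponent 1023
smallest = 2^-1022;                % mantissa all zeros, exponent -1022

sameMax = (largest  == realmax);   % true
sameMin = (smallest == realmin);   % true
```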


Precision. The 52 bits used for the mantissa correspond to about 15 to 16 base-10 digits.
Thus, $\pi$ would be expressed as


>> format long 
>> pi 


ans = 
3.14159265358979 


Note that the machine epsilon is $2^{-52} = 2.2204 \times 10^{-16}$.
MATLAB has a number of built-in functions related to its internal number representation.
For example, the realmax function displays the largest positive real number:


>> format long 
>> realmax 


ans = 
1.797693134862316e+308


Numbers occurring in computations that exceed this value create an overflow. In 
MATLAB they are set to infinity, inf. The realmin function displays the smallest positive 
real number: 


>> realmin 


ans = 
2.225073858507201e-308


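Numbers that are smaller than this value create an underflow. MATLAB first stores them with
reduced precision as denormalized (subnormal) numbers; results smaller than about $4.94 \times 10^{-324}$
are set to zero. Finally, the eps function displays the machine epsilon: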


>> eps 


ans = 
2.220446049250313e-016
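
A few one-line experiments help make these quantities concrete. The sketch below uses only built-in functions; the variable names are ours.

```matlab
a = (1 + eps) > 1;       % true: eps is the gap between 1 and the next double
b = (1 + eps/2) == 1;    % true: half a gap is rounded away
c = eps(1000);           % the gap near 1000 is larger, 2^-43
d = realmax * 2;         % exceeds the range, so the result is Inf
```

The third line is the binary counterpart of Fig. 4.5: the spacing between representable numbers grows with their magnitude.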


4.2.2 Arithmetic Manipulations of Computer Numbers 


Aside from the limitations of a computer’s number system, the actual arithmetic manipu- 
lations involving these numbers can also result in roundoff error. To understand how this 
occurs, let’s look at how the computer performs simple addition and subtraction. 

Because of their familiarity, normalized base-10 numbers will be employed to illus- 
trate the effect of roundoff errors on simple addition and subtraction. Other number bases 
would behave in a similar fashion. To simplify the discussion, we will employ a hypotheti- 
cal decimal computer with a 4-digit mantissa and a 1-digit exponent. 

When two floating-point numbers are added, the numbers are first expressed so that
they have the same exponents. For example, if we want to add 1.557 + 0.04341, the computer
would express the numbers as $0.1557 \times 10^1 + 0.004341 \times 10^1$. Then the mantissas
are added to give $0.160041 \times 10^1$. Now, because this hypothetical computer only carries a
4-digit mantissa, the excess digits get chopped off and the result is $0.1600 \times 10^1$.
Notice how the last two digits of the second number (41) that were shifted to the right have
essentially been lost from the computation.
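
The chopping in this addition can be imitated in MATLAB. The sketch below is an illustration only: chop is a hypothetical helper of our own (not a built-in) that keeps the first t significant decimal digits and discards the rest.

```matlab
% chop(x, t): keep the first t significant decimal digits of x.
chop = @(x, t) fix(x ./ 10.^(floor(log10(abs(x))) + 1 - t)) ...
               .* 10.^(floor(log10(abs(x))) + 1 - t);

a = 1.557;  b = 0.04341;
exact  = a + b;             % 1.60041 in ordinary double precision
stored = chop(a + b, 4);    % 1.600 on the 4-digit machine
lost   = exact - stored;    % about 0.00041, the digits that were shifted out
```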

Subtraction is performed identically to addition except that the sign of the subtrahend
is reversed. For example, suppose that we are subtracting 26.86 from 36.41. That is,


$$\begin{array}{r} 0.3641 \times 10^2 \\ -0.2686 \times 10^2 \\ \hline 0.0955 \times 10^2 \end{array}$$


For this case the result must be normalized because the leading zero is unnecessary.
So we must shift the decimal one place to the right to give $0.9550 \times 10^1 = 9.550$.
Notice that the zero added to the end of the mantissa is not significant but is merely appended
to fill the empty space created by the shift. Even more dramatic results would be
obtained when the numbers are very close as in


$$\begin{array}{r} 0.7642 \times 10^3 \\ -0.7641 \times 10^3 \\ \hline 0.0001 \times 10^3 \end{array}$$


which would be converted to $0.1000 \times 10^0 = 0.1000$. Thus, for this case, three nonsignificant
zeros are appended.
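
The same loss of significant digits shows up in double precision whenever two nearly equal numbers are subtracted. The example below is our own illustration, not one from the text: it evaluates (1 − cos x)/x² for a very small x, first directly and then with the identity 1 − cos x = 2 sin²(x/2), which avoids the subtraction. The true value is very close to 0.5.

```matlab
x = 1e-9;

% Direct evaluation: cos(x) rounds to exactly 1, so the numerator is 0.
naive  = (1 - cos(x)) / x^2;

% Rearranged form with no subtraction of nearly equal numbers.
stable = 2 * sin(x/2)^2 / x^2;

fprintf('naive = %g, rearranged = %g\n', naive, stable);
```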

The subtracting of two nearly equal numbers is called subtractive cancellation. It is 
the classic example of how the manner in which computers handle mathematics can lead to 
numerical problems. Other calculations that can cause problems include: 


Large Computations. Certain methods require extremely large numbers of arithmetic 
manipulations to arrive at their final results. In addition, these computations are often inter- 
dependent. That is, the later calculations are dependent on the results of earlier ones. Con- 
sequently, even though an individual roundoff error could be small, the cumulative effect 
over the course of a large computation can be significant. A very simple case involves 
summing a round base-10 number that is not round in base-2. Suppose that the following 
M-file is constructed: 


function sout = sumdemo()
% SUMDEMO  Add 0.0001 to a running total 10,000 times.
s = 0;
for i = 1:10000
    s = s + 0.0001;   % 0.0001 has no exact base-2 representation
end
sout = s;


When this function is executed, the result is 

>> format long 

>> sumdemo 

ans = 

0.99999999999991 

The format long command lets us see the 15 significant-digit representation used by
MATLAB. You would expect that the sum would be equal to 1. However, although 0.0001 is
a nice round number in base-10, it cannot be expressed exactly in base-2. Thus, the sum
comes out to be slightly different from 1. We should note that MATLAB has features that
are designed to minimize such errors. For example, suppose that you form a vector as in

>> format long 

>> s = [0:0.0001:1]; 
For this case, rather than being equal to 0.99999999999991, the last entry will be exactly 
one as verified by 

>> s(10001) 


ans = 
1 


Adding a Large and a Small Number. Suppose we add a small number, 0.0010, to a 
large number, 4000, using a hypothetical computer with the 4-digit mantissa and the 1-digit 
exponent. After modifying the smaller number so that its exponent matches the larger, 


$$\begin{array}{r} 0.4000 \times 10^4 \\ +0.0000001 \times 10^4 \\ \hline 0.4000001 \times 10^4 \end{array}$$


which is chopped to $0.4000 \times 10^4$. Thus, we might as well have not performed the addition!
This type of error can occur in the computation of an infinite series. The initial terms in 
such series are often relatively large in comparison with the later terms. Thus, after a few 
terms have been added, we are in the situation of adding a small quantity to a large quan- 
tity. One way to mitigate this type of error is to sum the series in reverse order. In this way, 
each new term will be of comparable magnitude to the accumulated sum. 
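
The benefit of reversing the order can be seen by summing a series in single precision, where the effect appears after relatively few terms. The sketch below is our own illustration: the series (the sum of 1/k²) and the number of terms are arbitrary choices, and a double-precision sum serves as the reference.

```matlab
n = 100000;

fwd = single(0);                       % largest terms first
for k = 1:n
    fwd = fwd + single(1) / single(k)^2;
end

rev = single(0);                       % smallest terms first
for k = n:-1:1
    rev = rev + single(1) / single(k)^2;
end

ref    = sum(1 ./ (1:n).^2);           % double-precision reference
errFwd = abs(double(fwd) - ref);
errRev = abs(double(rev) - ref);
fprintf('forward error = %g, reverse error = %g\n', errFwd, errRev);
```

In the forward loop the late terms eventually become too small to change the accumulated sum, exactly as in the 4-digit example above. In the reverse loop the small terms are combined with each other first, so they are not lost.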


Smearing. Smearing occurs whenever the individual terms in a summation are larger than 
the summation itself. One case where this occurs is in a series of mixed signs. 


Inner Products. As should be clear from the last sections, some infinite series are par- 
ticularly prone to roundoff error. Fortunately, the calculation of series is not one of the 
more common operations in numerical methods. A far more ubiquitous manipulation is the 
calculation of inner products as in 


$$\sum_{i=1}^{n} x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$


This operation is very common, particularly in the solution of simultaneous linear algebraic 
equations. Such summations are prone to roundoff error. Consequently, it is often desirable to 
compute such summations in double precision as is done automatically in MATLAB. 


4.3 TRUNCATION ERRORS


Truncation errors are those that result from using an approximation in place of an exact 
mathematical procedure. For example, in Chap. 1 we approximated the derivative of veloc- 
ity of a bungee jumper by a finite-difference equation of the form [Eq. (1.11)] 


$$\frac{dv}{dt} \cong \frac{\Delta v}{\Delta t} = \frac{v(t_{i+1}) - v(t_i)}{t_{i+1} - t_i} \tag{4.8}$$

A truncation error was introduced into the numerical solution because the difference equation
only approximates the true value of the derivative (recall Fig. 1.3). To gain insight into
the properties of such errors, we now turn to a mathematical formulation that is used widely
in numerical methods to express functions in an approximate fashion: the Taylor series.


4.3.1 The Taylor Series 


Taylor’s theorem and its associated formula, the Taylor series, is of great value in the study 
of numerical methods. In essence, the Taylor theorem states that any smooth function can 
be approximated as a polynomial. The Taylor series then provides a means to express this 
idea mathematically in a form that can be used to generate practical results. 
FIGURE 4.7


The approximation of $f(x) = -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2$ at x = 1 by
zero-order, first-order, and second-order Taylor series expansions.


A useful way to gain insight into the Taylor series is to build it term by term. A good 
problem context for this exercise is to predict a function value at one point in terms of the 
function value and its derivatives at another point. 

Suppose that you are blindfolded and taken to a location on the side of a hill facing
downslope (Fig. 4.7). We’ll call your horizontal location $x_i$ and your vertical distance with
respect to the base of the hill $f(x_i)$. You are given the task of predicting the height at a
position $x_{i+1}$, which is a distance h away from you.

At first, you are placed on a platform that is completely horizontal so that you have no 
idea that the hill is sloping down away from you. At this point, what would be your best 
guess at the height at $x_{i+1}$? If you think about it (remember you have no idea whatsoever
what’s in front of you), the best guess would be the same height as where you’re standing 
now! You could express this prediction mathematically as 


$$f(x_{i+1}) \cong f(x_i) \tag{4.9}$$


This relationship, which is called the zero-order approximation, indicates that the value
of f at the new point is the same as the value at the old point. This result makes intuitive
sense because if $x_i$ and $x_{i+1}$ are close to each other, it is likely that the new value is
similar to the old value.

Equation (4.9) provides a perfect estimate if the function being approximated is, in 
fact, a constant. For our problem, you would be right only if you happened to be standing 
on a perfectly flat plateau. However, if the function changes at all over the interval, addi- 
tional terms of the Taylor series are required to provide a better estimate. 

So now you are allowed to get off the platform and stand on the hill surface with one 
leg positioned in front of you and the other behind. You immediately sense that the front 
foot is lower than the back foot. In fact, you’re allowed to obtain a quantitative estimate of
the slope by measuring the difference in elevation and dividing it by the distance between 
your feet. 

With this additional information, you're clearly in a better position to predict the height
$f(x_{i+1})$. In essence, you use the slope estimate to project a straight line out to $x_{i+1}$. You can
express this prediction mathematically by


$$f(x_{i+1}) \cong f(x_i) + f'(x_i)h \tag{4.10}$$


This is called a first-order approximation because the additional first-order term consists
of a slope $f'(x_i)$ multiplied by h, the distance between $x_i$ and $x_{i+1}$. Thus, the expression is
now in the form of a straight line that is capable of predicting an increase or decrease of the
function between $x_i$ and $x_{i+1}$.

Although Eq. (4.10) can predict a change, it is only exact for a straight-line, or linear,
trend. To get a better prediction, we need to add more terms to our equation. So now you
are allowed to stand on the hill surface and take two measurements. First, you measure the
slope behind you by keeping one foot planted at $x_i$ and moving the other one back a distance
$\Delta x$. Let’s call this slope $f'_b(x_i)$. Then you measure the slope in front of you by keeping one
foot planted at $x_i$ and moving the other one forward $\Delta x$. Let’s call this slope $f'_f(x_i)$. You
immediately recognize that the slope behind is milder than the one in front. Clearly the drop
in height is “accelerating” downward in front of you. Thus, the odds are that $f(x_{i+1})$ is even
lower than your previous linear prediction.

As you might expect, you’re now going to add a second-order term to your equation 
and make it into a parabola. The Taylor series provides the correct way to do this as in 


$$f(x_{i+1}) \cong f(x_i) + f'(x_i)h + \frac{f''(x_i)}{2!}h^2 \tag{4.11}$$

To make use of this formula, you need an estimate of the second derivative. You can use
the last two slopes you determined to estimate it as

$$f''(x_i) \cong \frac{f'_f(x_i) - f'_b(x_i)}{\Delta x} \tag{4.12}$$

Thus, the second derivative is merely a derivative of a derivative; in this case, the rate of
change of the slope.

Before proceeding, let’s look carefully at Eq. (4.11). Recognize that all the values 
subscripted i represent values that you have estimated. That is, they are numbers. Consequently,
the only unknowns are the values at the prediction position $x_{i+1}$. Thus, it is a
quadratic equation of the form


$$f(h) \cong a_2h^2 + a_1h + a_0$$


Thus, we can see that the second-order Taylor series approximates the function with a 
second-order polynomial. 

Clearly, we could keep adding more derivatives to capture more of the function’s cur- 
vature. Thus, we arrive at the complete Taylor series expansion 


$$f(x_{i+1}) = f(x_i) + f'(x_i)h + \frac{f''(x_i)}{2!}h^2 + \frac{f^{(3)}(x_i)}{3!}h^3 + \cdots + \frac{f^{(n)}(x_i)}{n!}h^n + R_n \tag{4.13}$$
Note that because Eq. (4.13) is an infinite series, an equal sign replaces the approximate
sign that was used in Eqs. (4.9) through (4.11). A remainder term is also included to
account for all terms from n + 1 to infinity:


$$R_n = \frac{f^{(n+1)}(\xi)}{(n+1)!}h^{n+1} \tag{4.14}$$


where the subscript n connotes that this is the remainder for the nth-order approximation
and $\xi$ is a value of x that lies somewhere between $x_i$ and $x_{i+1}$.

We can now see why the Taylor theorem states that any smooth function can be ap- 
proximated as a polynomial and that the Taylor series provides a means to express this idea 
mathematically. 

In general, the nth-order Taylor series expansion will be exact for an nth-order poly- 
nomial. For other differentiable and continuous functions, such as exponentials and sinu- 
soids, a finite number of terms will not yield an exact estimate. Each additional term will 
contribute some improvement, however slight, to the approximation. This behavior will be 
demonstrated in Example 4.3. Only if an infinite number of terms are added will the series 
yield an exact result. 
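
The claim about polynomials can be tested on the fourth-order polynomial from Fig. 4.7. The sketch below builds the Taylor series term by term with polyval and polyder. The base point of 0 and the step of 1 are assumptions we have made for illustration, and the variable names are our own.

```matlab
p  = [-0.1 -0.15 -0.5 -0.25 1.2];   % coefficients of f(x), highest power first
xi = 0;  h = 1;                     % assumed base point and step size
truth = polyval(p, xi + h);         % f(1) = 0.2

d      = p;                         % coefficients of the current derivative
approx = 0;
est    = zeros(1, 5);
for n = 0:4
    approx     = approx + polyval(d, xi) * h^n / factorial(n);
    est(n + 1) = approx;            % nth-order estimate of f(1)
    d          = polyder(d);        % differentiate once more
end
disp(est)                           % 1.2  0.95  0.45  0.30  0.20
```

The zero-, first-, and second-order estimates (1.2, 0.95, and 0.45) miss the true value, but once the fourth-order term is included the series reproduces f(1) = 0.2, because all higher derivatives of a fourth-order polynomial are zero.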

Although the foregoing is true, the practical value of Taylor series expansions is 
that, in most cases, the inclusion of only a few terms will result in an approximation that 
is close enough to the true value for practical purposes. The assessment of how many 
terms are required to get “close enough” is based on the remainder term of the expansion 
(Eq. 4.14). This relationship has two major drawbacks. First, $\xi$ is not known exactly but
merely lies somewhere between $x_i$ and $x_{i+1}$. Second, to evaluate Eq. (4.14), we need to
determine the (n + 1)th derivative of f(x). To do this, we need to know f(x). However, 
if we knew f(x), there would be no need to perform the Taylor series expansion in the 
present context! 

Despite this dilemma, Eq. (4.14) is still useful for gaining insight into truncation 
errors. This is because we do have control over the term h in the equation. In other 
words, we can choose how far away from x we want to evaluate f (x), and we can con- 
trol the number of terms we include in the expansion. Consequently, Eq. (4.14) is often 
expressed as 


$$R_n = O(h^{n+1})$$


where the nomenclature $O(h^{n+1})$ means that the truncation error is of the order of $h^{n+1}$.
That is, the error is proportional to the step size h raised to the (n + 1)th power. Although
this approximation implies nothing regarding the magnitude of the derivatives
that multiply $h^{n+1}$, it is extremely useful in judging the comparative error of numerical
methods based on Taylor series expansions. For example, if the error is $O(h)$, halving
the step size will halve the error. On the other hand, if the error is $O(h^2)$, halving the
step size will quarter the error.

In general, we can usually assume that the truncation error is decreased by the addi- 
tion of terms to the Taylor series. In many cases, if h is sufficiently small, the first- and 
other lower-order terms usually account for a disproportionately high percent of the error. 
Thus, only a few terms are required to obtain an adequate approximation. This property is 
illustrated by the following example. 
EXAMPLE 4.3


Approximation of a Function with a Taylor Series Expansion 


Problem Statement. Use Taylor series expansions with n = 0 to 6 to approximate f(x) =
cos x at $x_{i+1} = \pi/3$ on the basis of the value of f(x) and its derivatives at $x_i = \pi/4$. Note that
this means that $h = \pi/3 - \pi/4 = \pi/12$.


Solution. Our knowledge of the true function allows us to determine the correct value
$f(\pi/3) = 0.5$. The zero-order approximation is [Eq. (4.9)]


$$f\left(\frac{\pi}{3}\right) \cong \cos\left(\frac{\pi}{4}\right) = 0.707106781$$


which represents a percent relative error of 


$$\varepsilon_t = \left|\frac{0.5 - 0.707106781}{0.5}\right| 100\% = 41.4\%$$


For the first-order approximation, we add the first derivative term where $f'(x) = -\sin x$:

$$f\left(\frac{\pi}{3}\right) \cong \cos\left(\frac{\pi}{4}\right) - \sin\left(\frac{\pi}{4}\right)\left(\frac{\pi}{12}\right) = 0.521986659$$


which has $|\varepsilon_t| = 4.40\%$. For the second-order approximation, we add the second derivative
term where $f''(x) = -\cos x$:


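$$f\left(\frac{\pi}{3}\right) \cong \cos\left(\frac{\pi}{4}\right) - \sin\left(\frac{\pi}{4}\right)\left(\frac{\pi}{12}\right) - \frac{\cos(\pi/4)}{2}\left(\frac{\pi}{12}\right)^2 = 0.497754491$$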


with $|\varepsilon_t| = 0.449\%$. Thus, the inclusion of additional terms results in an improved estimate.
The process can be continued and the results listed as in 


Order n    $f^{(n)}(x)$    $f(\pi/3)$       $|\varepsilon_t|$ (%)

0          cos x       0.707106781    41.4
1          −sin x      0.521986659    4.40
2          −cos x      0.497754491    0.449
3          sin x       0.499869147    $2.62 \times 10^{-2}$
4          cos x       0.500007551    $1.51 \times 10^{-3}$
5          −sin x      0.500000304    $6.08 \times 10^{-5}$
6          −cos x      0.499999988    $2.44 \times 10^{-6}$


Notice that the derivatives never go to zero as would be the case for a polynomial. 
Therefore, each additional term results in some improvement in the estimate. However, 
also notice how most of the improvement comes with the initial terms. For this case, by the 
time we have added the third-order term, the error is reduced to 0.026%, which means that 
we have attained 99.974% of the true value. Consequently, although the addition of more 
terms will reduce the error further, the improvement becomes negligible. 
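
The table can be regenerated with a short loop. The sketch below is one possible way to do it, not the only one; it relies on the fact that the derivatives of cos x repeat in a cycle of four, and the variable names are our own.

```matlab
xi = pi/4;  h = pi/12;  truth = cos(pi/3);

% Derivatives of cos x repeat every four orders: cos, -sin, -cos, sin.
derivs = {@(x) cos(x), @(x) -sin(x), @(x) -cos(x), @(x) sin(x)};

approx = 0;
vals = zeros(1, 7);  errs = zeros(1, 7);
for n = 0:6
    d      = derivs{mod(n, 4) + 1};
    approx = approx + d(xi) * h^n / factorial(n);   % add the nth-order term
    vals(n + 1) = approx;
    errs(n + 1) = abs((truth - approx) / truth) * 100;
    fprintf('%d  %.9f  %.3g\n', n, vals(n + 1), errs(n + 1));
end
```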
4.3.2 The Remainder for the Taylor Series Expansion


Before demonstrating how the Taylor series is actually used to estimate numerical errors, 
we must explain why we included the argument $\xi$ in Eq. (4.14). To do this, we will use a
simple, visually based explanation. 

Suppose that we truncated the Taylor series expansion [Eq. (4.13)] after the zero-order 
term to yield 


$$f(x_{i+1}) \cong f(x_i)$$

A visual depiction of this zero-order prediction is shown in Fig. 4.8. The remainder, or 
error, of this prediction, which is also shown in the illustration, consists of the infinite 
series of terms that were truncated 

$$R_0 = f'(x_i)h + \frac{f''(x_i)}{2!}h^2 + \frac{f^{(3)}(x_i)}{3!}h^3 + \cdots$$

It is obviously inconvenient to deal with the remainder in this infinite series format. One 
simplification might be to truncate the remainder itself, as in 


$$R_0 \cong f'(x_i)h \tag{4.15}$$


Although, as stated in the previous section, lower-order derivatives usually account for a 
greater share of the remainder than the higher-order terms, this result is still inexact because
of the neglected second- and higher-order terms. This “inexactness” is implied by the
approximate equality symbol ($\cong$) employed in Eq. (4.15).

An alternative simplification that transforms the approximation into an equivalence is
based on a graphical insight. As in Fig. 4.9, the derivative mean-value theorem states that
if a function f(x) and its first derivative are continuous over an interval from $x_i$ to $x_{i+1}$, then
there exists at least one point on the function that has a slope, designated by $f'(\xi)$, that is
parallel to the line joining $f(x_i)$ and $f(x_{i+1})$. The parameter $\xi$ marks the x value where this
slope occurs (Fig. 4.9). A physical illustration of this theorem is that, if you travel between
two points with an average velocity, there will be at least one moment during the course of
the trip when you will be moving at that average velocity.


FIGURE 4.8
Graphical depiction of a zero-order Taylor series prediction and remainder.


FIGURE 4.9
Graphical depiction of the derivative mean-value theorem.

By invoking this theorem, it is simple to realize that, as illustrated in Fig. 4.9, the slope
$f'(\xi)$ is equal to the rise $R_0$ divided by the run h, or


$$f'(\xi) = \frac{R_0}{h}$$

which can be rearranged to give


$$R_0 = f'(\xi)h \tag{4.16}$$
Thus, we have derived the zero-order version of Eq. (4.14). The higher-order versions
are merely a logical extension of the reasoning used to derive Eq. (4.16). The first-order
version is

$$R_1 = \frac{f''(\xi)}{2!}h^2 \tag{4.17}$$

For this case, the value of $\xi$ conforms to the x value corresponding to the second derivative
that makes Eq. (4.17) exact. Similar higher-order versions can be developed from Eq. (4.14).
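
For a function we know, the point promised by the mean-value theorem can actually be located. The sketch below does this for the zero-order remainder of Example 4.3, where f(x) = cos x is expanded from π/4 to π/3. Solving for the point with asin works only because f'(x) = −sin x here; the variable names (xi_star and so on) are our own.

```matlab
xi = pi/4;  xip1 = pi/3;  h = xip1 - xi;

R0    = cos(xip1) - cos(xi);   % exact remainder of the zero-order prediction
slope = R0 / h;                % Eq. (4.16) rearranged: f'(xi_star) = R0/h

xi_star = asin(-slope);        % solve -sin(xi_star) = slope
fprintf('xi = %.4f lies between %.4f and %.4f\n', xi_star, xi, xip1);
```

The computed point falls inside the interval, as the theorem requires. In practice, of course, we would not know f(x) and so could not locate the point this way.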


4.3.3 Using the Taylor Series to Estimate Truncation Errors 


Although the Taylor series will be extremely useful in estimating truncation errors through- 
out this book, it may not be clear to you how the expansion can actually be applied to 
numerical methods. In fact, we have already done so in our example of the bungee jumper.
Recall that the objective of both Examples 1.1 and 1.2 was to predict velocity as a function
of time. That is, we were interested in determining v(t). As specified by Eq. (4.13), v(t) can
be expanded in a Taylor series:


$$v(t_{i+1}) = v(t_i) + v'(t_i)(t_{i+1} - t_i) + \frac{v''(t_i)}{2!}(t_{i+1} - t_i)^2 + \cdots + R_n$$


Now let us truncate the series after the first derivative term:

$$v(t_{i+1}) = v(t_i) + v'(t_i)(t_{i+1} - t_i) + R_1 \tag{4.18}$$


Equation (4.18) can be solved for 


$$v'(t_i) = \underbrace{\frac{v(t_{i+1}) - v(t_i)}{t_{i+1} - t_i}}_{\text{First-order approximation}} - \underbrace{\frac{R_1}{t_{i+1} - t_i}}_{\text{Truncation error}} \tag{4.19}$$


The first part of Eq. (4.19) is exactly the same relationship that was used to approximate the 
derivative in Example 1.2 [Eq. (1.11)]. However, because of the Taylor series approach, 
we have now obtained an estimate of the truncation error associated with this approxima- 
tion of the derivative. Using Eqs. (4.14) and (4.19) yields 

$$\frac{R_1}{t_{i+1} - t_i} = \frac{v''(\xi)}{2!}(t_{i+1} - t_i)$$


or


$$\frac{R_1}{t_{i+1} - t_i} = O(t_{i+1} - t_i)$$

Thus, the estimate of the derivative [Eq. (1.11) or the first part of Eq. (4.19)] has a truncation
error of order $t_{i+1} - t_i$. In other words, the error of our derivative approximation should
be proportional to the step size. Consequently, if we halve the step size, we would expect 
to halve the error of the derivative. 
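
This prediction is easy to test numerically. The sketch below is our own illustration: it applies the difference formula to f(x) = cos x at x = π/4, with step sizes of 0.1 and 0.05 chosen arbitrarily, and compares the errors against the exact derivative −sin x.

```matlab
f = @(x) cos(x);  x = pi/4;
exact = -sin(x);                       % true derivative

fwd = @(h) (f(x + h) - f(x)) / h;      % first part of Eq. (4.19)

e1 = abs(fwd(0.1)  - exact);           % error with step h
e2 = abs(fwd(0.05) - exact);           % error with step h/2
ratio = e1 / e2;                       % close to 2 for an O(h) error
fprintf('errors: %g and %g, ratio = %.3f\n', e1, e2, ratio);
```

The ratio is not exactly 2 because the neglected higher-order terms also contribute, but it approaches 2 as the step size shrinks.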


4.3.4 Numerical Differentiation 


Equation (4.19) is given a formal label in numerical methods—it is called a finite differ- 
ence. It can be represented generally as 


$$f'(x_i) = \frac{f(x_{i+1}) - f(x_i)}{x_{i+1} - x_i} + O(x_{i+1} - x_i) \tag{4.20}$$


or


$$f'(x_i) = \frac{f(x_{i+1}) - f(x_i)}{h} + O(h) \tag{4.21}$$

where h is called the step size, that is, the length of the interval over which the approximation
is made, $x_{i+1} - x_i$. It is termed a “forward” difference because it utilizes data at i and
i + 1 to estimate the derivative (Fig. 4.10a).
FIGURE 4.10


Graphical depiction of (a) forward, (b) backward, and (c) centered finite-difference 
approximations of the first derivative. 
This forward difference is but one of many that can be developed from the Taylor 
series to approximate derivatives numerically. For example, backward and centered difference 
approximations of the first derivative can be developed in a fashion similar to the 
derivation of Eq. (4.19). The former utilizes values at $x_{i-1}$ and $x_i$ (Fig. 4.10b), whereas 
the latter uses values that are equally spaced around the point at which the derivative is 
estimated (Fig. 4.10c). More accurate approximations of the first derivative can be devel- 
oped by including higher-order terms of the Taylor series. Finally, all the foregoing versions 
can also be developed for second, third, and higher derivatives. The following sections pro- 
vide brief summaries illustrating how some of these cases are derived. 


Backward Difference Approximation of the First Derivative. The Taylor series can 
be expanded backward to calculate a previous value on the basis of a present value, as in 


$$f(x_{i-1}) = f(x_i) - f'(x_i)h + \frac{f''(x_i)}{2!}h^2 - \cdots \tag{4.22}$$


Truncating this equation after the first derivative and rearranging yields 


$$f'(x_i) \cong \frac{f(x_i) - f(x_{i-1})}{h} \tag{4.23}$$


where the error is O(h). 


Centered Difference Approximation of the First Derivative. A third way to approxi- 
mate the first derivative is to subtract Eq. (4.22) from the forward Taylor series expansion: 


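$$f(x_{i+1}) = f(x_i) + f'(x_i)h + \frac{f''(x_i)}{2!}h^2 + \cdots \tag{4.24}$$


to yield 


$$f(x_{i+1}) = f(x_{i-1}) + 2f'(x_i)h + 2\frac{f^{(3)}(x_i)}{3!}h^3 + \cdots$$


which can be solved for 


$$f'(x_i) = \frac{f(x_{i+1}) - f(x_{i-1})}{2h} - \frac{f^{(3)}(x_i)}{6}h^2 + \cdots$$


or 


$$f'(x_i) = \frac{f(x_{i+1}) - f(x_{i-1})}{2h} - O(h^2) \tag{4.25}$$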


Equation (4.25) is a centered finite difference representation of the first derivative. 
Notice that the truncation error is of the order of $h^2$ in contrast to the forward and backward 
approximations that were of the order of h. Consequently, the Taylor series analysis yields 
the practical information that the centered difference is a more accurate representation of 
the derivative (Fig. 4.10c). For example, if we halve the step size using a forward or backward 
difference, we would approximately halve the truncation error, whereas for the centered 
difference, the error would be quartered. 
EXAMPLE 4.4 


Finite-Difference Approximations of Derivatives 


Problem Statement. Use forward and backward difference approximations of O(h) and a 
centered difference approximation of $O(h^2)$ to estimate the first derivative of 


$$f(x) = -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2$$


at x = 0.5 using a step size h = 0.5. Repeat the computation using h = 0.25. Note that the 
derivative can be calculated directly as 


$$f'(x) = -0.4x^3 - 0.45x^2 - 1.0x - 0.25$$

and can be used to compute the true value as $f'(0.5) = -0.9125$. 
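Solution. For h = 0.5, the function can be employed to determine 

$$\begin{aligned}
x_{i-1} &= 0 & f(x_{i-1}) &= 1.2 \\
x_i &= 0.5 & f(x_i) &= 0.925 \\
x_{i+1} &= 1.0 & f(x_{i+1}) &= 0.2
\end{aligned}$$


These values can be used to compute the forward difference [Eq. (4.21)], 

$$f'(0.5) \cong \frac{0.2 - 0.925}{0.5} = -1.45 \qquad |\varepsilon_t| = 58.9\%$$

the backward difference [Eq. (4.23)], 

$$f'(0.5) \cong \frac{0.925 - 1.2}{0.5} = -0.55 \qquad |\varepsilon_t| = 39.7\%$$


and the centered difference [Eq. (4.25)], 


$$f'(0.5) \cong \frac{0.2 - 1.2}{1.0} = -1.0 \qquad |\varepsilon_t| = 9.6\%$$

For h = 0.25, 

$$\begin{aligned}
x_{i-1} &= 0.25 & f(x_{i-1}) &= 1.10351563 \\
x_i &= 0.5 & f(x_i) &= 0.925 \\
x_{i+1} &= 0.75 & f(x_{i+1}) &= 0.63632813
\end{aligned}$$


which can be used to compute the forward difference, 


$$f'(0.5) \cong \frac{0.63632813 - 0.925}{0.25} = -1.155 \qquad |\varepsilon_t| = 26.5\%$$


the backward difference, 


$$f'(0.5) \cong \frac{0.925 - 1.10351563}{0.25} = -0.714 \qquad |\varepsilon_t| = 21.7\%$$


and the centered difference, 


$$f'(0.5) \cong \frac{0.63632813 - 1.10351563}{0.5} = -0.934 \qquad |\varepsilon_t| = 2.4\%$$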
For both step sizes, the centered difference approximation is more accurate than forward 
or backward differences. Also, as predicted by the Taylor series analysis, halving 
the step size approximately halves the error of the backward and forward differences and 
quarters the error of the centered difference. 
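
The computations of Example 4.4 can be sketched in a few lines of MATLAB. The script below is only an illustrative sketch: the variable names (`hs`, `et`, `ratio`) are our own, and the true value −0.9125 is taken from the problem statement. It evaluates the three difference formulas for both step sizes and then forms the ratio of the errors when the step is halved.

```matlab
% Sketch: reproduce the finite-difference estimates of Example 4.4.
f  = @(x) -0.1*x.^4 - 0.15*x.^3 - 0.5*x.^2 - 0.25*x + 1.2;
x  = 0.5;            % point where the derivative is estimated
dftrue = -0.9125;    % true derivative f'(0.5) quoted in the example
hs = [0.5 0.25];     % the two step sizes used in the example

et = zeros(numel(hs),3);   % percent relative errors: [forward backward centered]
for k = 1:numel(hs)
  h = hs(k);
  d = [(f(x+h) - f(x))/h, ...       % forward,  Eq. (4.21)
       (f(x) - f(x-h))/h, ...       % backward, Eq. (4.23)
       (f(x+h) - f(x-h))/(2*h)];    % centered, Eq. (4.25)
  et(k,:) = abs((dftrue - d)/dftrue)*100;
  fprintf('h = %4.2f: forward %8.4f  backward %8.4f  centered %8.4f\n', h, d);
end
ratio = et(1,:)./et(2,:);   % error reduction when h is halved
fprintf('error ratio when h is halved: %5.2f %5.2f %5.2f\n', ratio);
```

The ratios for the forward and backward differences come out near 2, and the ratio for the centered difference comes out at 4, which is what the O(h) and $O(h^2)$ error terms lead us to expect.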




Finite-Difference Approximations of Higher Derivatives. Besides first derivatives, 
the Taylor series expansion can be used to derive numerical estimates of higher derivatives. 
To do this, we write a forward Taylor series expansion for $f(x_{i+2})$ in terms of $f(x_i)$: 


$$f(x_{i+2}) = f(x_i) + f'(x_i)(2h) + \frac{f''(x_i)}{2!}(2h)^2 + \cdots \tag{4.26}$$

Equation (4.24) can be multiplied by 2 and subtracted from Eq. (4.26) to give 

$$f(x_{i+2}) - 2f(x_{i+1}) = -f(x_i) + f''(x_i)h^2 + \cdots$$


which can be solved for 


$$f''(x_i) = \frac{f(x_{i+2}) - 2f(x_{i+1}) + f(x_i)}{h^2} + O(h) \tag{4.27}$$

This relationship is called the second forward finite difference. Similar manipulations can 
be employed to derive a backward version 

$$f''(x_i) = \frac{f(x_i) - 2f(x_{i-1}) + f(x_{i-2})}{h^2} + O(h)$$

A centered difference approximation for the second derivative can be derived by adding 
Eqs. (4.22) and (4.24) and rearranging the result to give 

$$f''(x_i) = \frac{f(x_{i+1}) - 2f(x_i) + f(x_{i-1})}{h^2} + O(h^2)$$

As was the case with the first-derivative approximations, the centered case is more accurate. 
Notice also that the centered version can be alternatively expressed as 


$$f''(x_i) \cong \frac{\dfrac{f(x_{i+1}) - f(x_i)}{h} - \dfrac{f(x_i) - f(x_{i-1})}{h}}{h}$$
Thus, just as the second derivative is a derivative of a derivative, the second finite differ- 
ence approximation is a difference of two first finite differences [recall Eq. (4.12)]. 
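
As a brief illustration, the three second-difference formulas can be compared numerically. The sketch below borrows the polynomial of Example 4.4 as a test case; that choice, the step size h = 0.25, and the variable names are our own assumptions rather than part of the derivation above. It also evaluates the "difference of two first differences" form to show that it gives the same number as the centered formula.

```matlab
% Sketch: second-derivative finite differences for the polynomial of Example 4.4.
f = @(x) -0.1*x.^4 - 0.15*x.^3 - 0.5*x.^2 - 0.25*x + 1.2;
x = 0.5;  h = 0.25;                          % assumed evaluation point and step
d2true = -1.2*x^2 - 0.9*x - 1.0;             % analytical second derivative

d2f = (f(x+2*h) - 2*f(x+h) + f(x))/h^2;      % forward,  Eq. (4.27), O(h)
d2b = (f(x) - 2*f(x-h) + f(x-2*h))/h^2;      % backward, O(h)
d2c = (f(x+h) - 2*f(x) + f(x-h))/h^2;        % centered, O(h^2)

% Centered version written as a difference of two first differences
d2n = ((f(x+h) - f(x))/h - (f(x) - f(x-h))/h)/h;

fprintf('true %8.4f  forward %8.4f  backward %8.4f  centered %8.4f\n', ...
        d2true, d2f, d2b, d2c);
fprintf('nested form %8.4f\n', d2n);
```

For this test case the centered estimate is much closer to the true value than either one-sided estimate, consistent with its $O(h^2)$ error.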


4.4 TOTAL NUMERICAL ERROR 


The total numerical error is the summation of the truncation and roundoff errors. In general, 
the only way to minimize roundoff errors is to increase the number of significant figures 
of the computer. Further, we have noted that roundoff error may increase due to subtractive 
cancellation or due to an increase in the number of computations in an analysis. In contrast, 
Example 4.4 demonstrated that the truncation error can be reduced by decreasing the step 
size. Because a decrease in step size can lead to subtractive cancellation or to an increase 
in computations, the truncation errors are decreased as the roundoff errors are increased. 
FIGURE 4.11 

A graphical depiction of the trade-off between roundoff and truncation error that sometimes 
comes into play in the course of a numerical method. The point of diminishing returns is 
shown, where roundoff error begins to negate the benefits of step-size reduction. 


Therefore, we are faced by the following dilemma: The strategy for decreasing one 
component of the total error leads to an increase of the other component. In a computa- 
tion, we could conceivably decrease the step size to minimize truncation errors only to 
discover that in doing so, the roundoff error begins to dominate the solution and the total 
error grows! Thus, our remedy becomes our problem (Fig. 4.11). One challenge that we 
face is to determine an appropriate step size for a particular computation. We would like to 
choose a large step size to decrease the amount of calculations and roundoff errors without 
incurring the penalty of a large truncation error. If the total error is as shown in Fig. 4.11, 
the challenge is to identify the point of diminishing returns where roundoff error begins to 
negate the benefits of step-size reduction. 

When using MATLAB, such situations are relatively uncommon because of its 15- to 16- 
digit precision. Nevertheless, they sometimes do occur and suggest a sort of “numerical un- 
certainty principle” that places an absolute limit on the accuracy that may be obtained using 
certain computerized numerical methods. We explore such a case in the following section. 


4.4.1 Error Analysis of Numerical Differentiation 


As described in Sec. 4.3.4, a centered difference approximation of the first derivative can 
be written as [Eq. (4.25)] 


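$$\underbrace{f'(x_i)}_{\text{True value}} = \underbrace{\frac{f(x_{i+1}) - f(x_{i-1})}{2h}}_{\text{Finite-difference approximation}} - \underbrace{\frac{f^{(3)}(\xi)}{6}h^2}_{\text{Truncation error}} \tag{4.28}$$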


Thus, if the two function values in the numerator of the finite-difference approximation 
have no roundoff error, the only error is due to truncation. 


However, because we are using digital computers, the function values do include 
roundoff error as in 


$$f(x_{i-1}) = \tilde{f}(x_{i-1}) + e_{i-1}$$
$$f(x_{i+1}) = \tilde{f}(x_{i+1}) + e_{i+1}$$


where the $\tilde{f}$'s are the rounded function values and the e's are the associated roundoff errors. 
Substituting these values into Eq. (4.28) gives 


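$$\underbrace{f'(x_i)}_{\text{True value}} = \underbrace{\frac{\tilde{f}(x_{i+1}) - \tilde{f}(x_{i-1})}{2h}}_{\text{Finite-difference approximation}} + \underbrace{\frac{e_{i+1} - e_{i-1}}{2h}}_{\text{Roundoff error}} - \underbrace{\frac{f^{(3)}(\xi)}{6}h^2}_{\text{Truncation error}}$$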


We can see that the total error of the finite-difference approximation consists of a 
roundoff error that grows as the step size is reduced and a truncation error that shrinks 
as the step size is reduced. 

Assuming that the absolute value of each component of the roundoff error has an 
upper bound of ε, the maximum possible value of the difference $e_{i+1} - e_{i-1}$ will be 2ε. Further, 
assume that the third derivative has a maximum absolute value of M. An upper bound 
on the absolute value of the total error can therefore be represented as 


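$$\text{Total error} = \left| f'(x_i) - \frac{\tilde{f}(x_{i+1}) - \tilde{f}(x_{i-1})}{2h} \right| \le \frac{\varepsilon}{h} + \frac{h^2 M}{6} \tag{4.29}$$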


An optimal step size can be determined by differentiating Eq. (4.29), setting the result 
equal to zero and solving for 


$$h_{opt} = \sqrt[3]{\frac{3\varepsilon}{M}} \tag{4.30}$$


EXAMPLE 4.5 


Roundoff and Truncation Errors in Numerical Differentiation 


Problem Statement. In Example 4.4, we used a centered difference approximation of 
$O(h^2)$ to estimate the first derivative of the following function at x = 0.5, 


$$f(x) = -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2$$


Perform the same computation starting with h = 1. Then progressively divide the step size 
by a factor of 10 to demonstrate how roundoff becomes dominant as the step size is reduced. 
Relate your results to Eq. (4.30). Recall that the true value of the derivative is —0.9125. 


Solution. We can develop the following M-file to perform the computations and plot the 
results. Notice that we pass both the function and its analytical derivative as arguments: 
function diffex(func,dfunc,x,n)
% diffex: error of the centered difference as the step size shrinks
%   func = function, dfunc = its analytical derivative,
%   x = evaluation point, n = number of step sizes
format long
dftrue=dfunc(x);
h=1;
H(1)=h;
D(1)=(func(x+h)-func(x-h))/(2*h);
E(1)=abs(dftrue-D(1));
for i=2:n
  h=h/10;   % divide the step size by 10 on each pass
  H(i)=h;
  D(i)=(func(x+h)-func(x-h))/(2*h);
  E(i)=abs(dftrue-D(i));
end
L=[H' D' E']';
fprintf('   step size   finite difference     true error\n');
fprintf('%14.10f %16.14f %16.13f\n',L);
loglog(H,E),xlabel('Step Size'),ylabel('Error')
title('Plot of Error Versus Step Size')
format short


The M-file can then be run using the following commands: 


>> ff=@(x) -0.1*x^4-0.15*x^3-0.5*x^2-0.25*x+1.2; 
>> df=@(x) -0.4*x^3-0.45*x^2-x-0.25; 
>> diffex(ff,df,0.5,11) 


step size finite difference true error 
1.0000000000 -1.26250000000000 0.3500000000000 
0.1000000000 -0.91600000000000 0.0035000000000 
0.0100000000 -0.91253500000000 0.0000350000000 
0.0010000000 -0.91250035000001 0.0000003500000 
0.0001000000 -0.91250000349985 0.0000000034998 
0.0000100000 -0.91250000003318 0.0000000000332 
0.0000010000 -0.91250000000542 0.0000000000054 
0.0000001000 -0.91249999945031 0.0000000005497 
0.0000000100 -0.91250000333609 0.0000000033361 
0.0000000010 -0.91250001998944 0.0000000199894 
0.0000000001 -0.91250007550059 0.0000000755006 


As depicted in Fig. 4.12, the results are as expected. At first, roundoff is minimal 
and the estimate is dominated by truncation error. Hence, as in Eq. (4.29), the total error 
drops by a factor of 100 each time we divide the step by 10. However, starting at about 
h = 0.0001, we see roundoff error begin to creep in and erode the rate at which the error 
diminishes. A minimum error is reached at $h = 10^{-6}$. Beyond this point, the error increases 
as roundoff dominates. 

Because we are dealing with an easily differentiable function, we can also investigate 
whether these results are consistent with Eq. (4.30). First, we can estimate M by evaluating 
the function’s third derivative as 


$$M = |f^{(3)}(0.5)| = |-2.4(0.5) - 0.9| = 2.1$$


Because MATLAB has a precision of about 15 to 16 base-10 digits, a rough estimate of 
the upper bound on roundoff would be about $\varepsilon = 0.5 \times 10^{-16}$. Substituting these values 
into Eq. (4.30) gives 


$$h_{opt} = \sqrt[3]{\frac{3(0.5 \times 10^{-16})}{2.1}} \approx 4.1 \times 10^{-6}$$


which is on the same order as the result of $1 \times 10^{-6}$ obtained with MATLAB. 
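
The estimate from Eq. (4.30) is easy to check numerically. The following lines are a minimal sketch under the assumptions stated in the example (M = 2.1 and a roundoff bound of 0.5 × 10^-16); the names `epsr`, `hopt`, and `bound` are ours. The sketch evaluates the optimal step size and confirms that the error bound of Eq. (4.29) is larger on either side of it.

```matlab
% Sketch: evaluate Eq. (4.30) and the error bound of Eq. (4.29).
M    = abs(-2.4*0.5 - 0.9);          % |f'''(0.5)| for the example polynomial
epsr = 0.5e-16;                      % assumed upper bound on roundoff
hopt = (3*epsr/M)^(1/3);             % optimal step size, Eq. (4.30)

bound = @(h) epsr./h + h.^2*M/6;     % right-hand side of Eq. (4.29)
fprintf('hopt = %10.3e, bound at hopt = %10.3e\n', hopt, bound(hopt));
fprintf('bound at hopt/2 = %10.3e, at 2*hopt = %10.3e\n', ...
        bound(hopt/2), bound(2*hopt));
```

Keep in mind that Eq. (4.29) is an upper bound, so the step size that minimizes the error actually observed in a run of `diffex` need only agree with `hopt` in order of magnitude.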




FIGURE 4.12 


Plot of error versus step size 
4.4.2 Control of Numerical Errors 


For most practical cases, we do not know the exact error associated with numerical methods. 
The exception, of course, is when we know the exact solution, which makes our numerical 
approximations unnecessary. Therefore, for most engineering and scientific applications 
we must settle for some estimate of the error in our calculations. 

There are no systematic and general approaches to evaluating numerical errors for all 
problems. In many cases error estimates are based on the experience and judgment of the 
engineer or scientist. 

Although error analysis is to a certain extent an art, there are several practical programming 
guidelines we can suggest. First and foremost, avoid subtracting two nearly equal numbers. Loss 
of significance almost always occurs when this is done. Sometimes you can rearrange or refor- 
mulate the problem to avoid subtractive cancellation. If this is not possible, you may want to use 
extended-precision arithmetic. Furthermore, when adding and subtracting numbers, it is best 
to sort the numbers and work with the smallest numbers first. This avoids loss of significance. 
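
The advice about summation order can be illustrated with a small experiment. The sketch below is our own construction, not an example from the text: it uses single precision and explicit loops so that the order of the additions is unambiguous, and it adds one large number to 100,000 copies of a small number in two different orders.

```matlab
% Sketch: effect of summation order in single precision (hypothetical data).
small = single(1e-8)*ones(1,1e5,'single');   % many small numbers
big   = single(1);                           % one large number

s1 = big;                        % largest number first
for k = 1:numel(small)
  s1 = s1 + small(k);            % each small term is lost to roundoff
end

s2 = single(0);                  % smallest numbers first
for k = 1:numel(small)
  s2 = s2 + small(k);            % small terms accumulate before meeting big
end
s2 = s2 + big;

exact = 1 + 1e5*1e-8;            % reference value computed in double precision
fprintf('large first: %.7f   small first: %.7f   exact: %.7f\n', s1, s2, exact);
```

When the large number comes first, every small term falls below the spacing of single-precision numbers near 1 and is discarded, so the sum never moves. Summing the small terms first lets them build up to a value that survives the final addition.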

Beyond these computational hints, one can attempt to predict total numerical errors using 
theoretical formulations. The Taylor series is our primary tool for analysis of such errors. Pre- 
diction of total numerical error is very complicated for even moderately sized problems and 
tends to be pessimistic. Therefore, it is usually attempted for only small-scale tasks. 


The tendency is to push forward with the numerical computations and try to estimate 
the accuracy of your results. This can sometimes be done by seeing if the results satisfy 
some condition or equation as a check. Or it may be possible to substitute the results back 
into the original equation to check that it is actually satisfied. 

Finally you should be prepared to perform numerical experiments to increase your 
awareness of computational errors and possible ill-conditioned problems. Such experi- 
ments may involve repeating the computations with a different step size or method and 
comparing the results. We may employ sensitivity analysis to see how our solution changes 
when we change model parameters or input values. We may want to try different numerical 
algorithms that have different theoretical foundations, are based on different computational 
strategies, or have different convergence properties and stability characteristics. 
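
One simple numerical experiment of this kind is to repeat a computation with half the step size and compare the two results. The sketch below is an illustration under stated assumptions rather than a general recipe: it reuses the polynomial of Example 4.4, and the divisor of 3 comes from assuming that the error of the centered difference is proportional to $h^2$ (a Richardson-type argument that is our addition here). The true value −0.9125 is used only to check the estimate.

```matlab
% Sketch: estimate the error of a centered difference by halving the step.
f  = @(x) -0.1*x.^4 - 0.15*x.^3 - 0.5*x.^2 - 0.25*x + 1.2;
dc = @(x,h) (f(x+h) - f(x-h))/(2*h);   % centered difference, Eq. (4.25)

x = 0.5;  h = 0.5;
D1 = dc(x,h);                  % estimate with step h
D2 = dc(x,h/2);                % estimate with step h/2

est      = abs(D2 - D1)/3;     % estimated error of D2, assuming error ~ h^2
true_err = abs(-0.9125 - D2);  % actual error, available here for comparison
fprintf('estimated error %8.6f   true error %8.6f\n', est, true_err);
```

For this polynomial the estimate and the true error agree because the truncation error of the centered difference is exactly proportional to $h^2$. For other functions the agreement is only approximate and deteriorates once roundoff becomes significant.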

When the results of numerical computations are extremely critical and may involve 
loss of human life or have severe economic ramifications, it is appropriate to take special 
precautions. This may involve the use of two or more independent groups to solve the same 
problem so that their results can be compared. 

The roles of errors will be a topic of concern and analysis in all sections of this book. 
We will leave these investigations to specific sections. 


4.5 BLUNDERS, MODEL ERRORS, AND DATA UNCERTAINTY 


Although the following sources of error are not directly connected with most of the nu- 
merical methods in this book, they can sometimes have great impact on the success of a 
modeling effort. Thus, they must always be kept in mind when applying numerical tech- 
niques in the context of real-world problems. 


4.5.1 Blunders 


We are all familiar with gross errors, or blunders. In the early years of computers, erroneous numerical 
results could sometimes be attributed to malfunctions of the computer itself. Today, this 
source of error is highly unlikely, and most blunders must be attributed to human imperfection. 

Blunders can occur at any stage of the mathematical modeling process and can con- 
tribute to all the other components of error. They can be avoided only by sound knowledge 
of fundamental principles and by the care with which you approach and design your solu- 
tion to a problem. 

Blunders are usually disregarded in discussions of numerical methods. This is no doubt 
due to the fact that, try as we may, mistakes are to a certain extent unavoidable. However, 
we believe that there are a number of ways in which their occurrence can be minimized. In 
particular, the good programming habits that were outlined in Chap. 3 are extremely useful 
for mitigating programming blunders. In addition, there are usually simple ways to check 
whether a particular numerical method is working properly. Throughout this book, we dis- 
cuss ways to check the results of numerical calculations. 


4.5.2 Model Errors 


Model errors relate to bias that can be ascribed to incomplete mathematical models. An ex- 
ample of a negligible model error is the fact that Newton’s second law does not account for 
relativistic effects. This does not detract from the adequacy of the solution in Example 1.1 
because these errors are minimal on the time and space scales associated with the bungee 
jumper problem. 

However, suppose that air resistance is not proportional to the square of the fall velocity, 
as in Eq. (1.7), but is related to velocity and other factors in a different way. If such were the 
case, both the analytical and numerical solutions obtained in Chap. 1 would be erroneous because 
of model error. You should be cognizant of this type of error and realize that, if you are 
working with a poorly conceived model, no numerical method will provide adequate results. 


4.5.3 Data Uncertainty 


Errors sometimes enter into an analysis because of uncertainty in the physical data on which 
a model is based. For instance, suppose we wanted to test the bungee jumper model by 
having an individual make repeated jumps and then measuring his or her velocity after a 
specified time interval. Uncertainty would undoubtedly be associated with these measurements, 
as the jumper would fall faster during some jumps than during others. These 
errors can exhibit both inaccuracy and imprecision. If our instruments consistently under- 
estimate or overestimate the velocity, we are dealing with an inaccurate, or biased, device. 
On the other hand, if the measurements are randomly high and low, we are dealing with a 
question of precision. 

Measurement errors can be quantified by summarizing the data with one or more well- 
chosen statistics that convey as much information as possible regarding specific character- 
istics of the data. These descriptive statistics are most often selected to represent (1) the 
location of the center of the distribution of the data and (2) the degree of spread of the data. 
As such, they provide a measure of the bias and imprecision, respectively. We will return to 
the topic of characterizing data uncertainty when we discuss regression in Part Four. 

Although you must be cognizant of blunders, model errors, and uncertain data, the nu- 
merical methods used for building models can be studied, for the most part, independently 
of these errors. Therefore, for most of this book, we will assume that we have not made gross 
errors, we have a sound model, and we are dealing with error-free measurements. Under 
these conditions, we can study numerical errors without complicating factors. 


PROBLEMS 


4.1 The “divide and average” method, an old-time method 
for approximating the square root of any positive number a, 
can be formulated as 


$$x = \frac{x + a/x}{2}$$


Write a well-structured function to implement this algorithm 
based on the algorithm outlined in Fig. 4.2. 

4.2 Convert the following base-2 numbers to base 10: 
(a) 1011001, (b) 0.01011, and (c) 110.01001. 

4.3 Convert the following base-8 numbers to base 10: 
61,565 and 2.71. 

4.4 For computers, the machine epsilon ε can also be 
thought of as the smallest number that when added to one 
gives a number greater than 1. An algorithm based on this 
idea can be developed as 


Step 1: Set ε = 1. 

Step 2: If 1 + ε is less than or equal to 1, then go to Step 5. 
Otherwise go to Step 3. 

Step 3: ε = ε/2 

Step 4: Return to Step 2 

Step 5: ε = 2 × ε 


Write your own M-file based on this algorithm to determine 
the machine epsilon. Validate the result by comparing it with 
the value computed with the built-in function eps. 

4.5 In a fashion similar to Prob. 4.4, develop your own 
M-file to determine the smallest positive real number used 
in MATLAB. Base your algorithm on the notion that your 
computer will be unable to reliably distinguish between zero 
and a quantity that is smaller than this number. Note that 
the result you obtain will differ from the value computed 
with realmin. Challenge question: Investigate the results by 
taking the base-2 logarithm of the number generated by your 
code and those obtained with realmin. 

4.6 Although it is not commonly used, MATLAB allows 
numbers to be expressed in single precision. Each value 
is stored in 4 bytes with 1 bit for the sign, 23 bits for the 
mantissa, and 8 bits for the signed exponent. Determine the 
smallest and largest positive floating-point numbers as well 
as the machine epsilon for single precision representation. 
Note that the exponents range from —126 to 127. 

4.7 For the hypothetical base-10 computer in Example 4.2, 
prove that the machine epsilon is 0.05. 

4.8 The derivative of $f(x) = 1/(1 - 3x^2)$ is given by 


$$\frac{6x}{(1 - 3x^2)^2}$$

Do you expect to have difficulties evaluating this function 
at x = 0.577? Try it using 3- and 4-digit arithmetic with 
chopping. 
4.9 (a) Evaluate the polynomial 


$$y = x^3 - 7x^2 + 8x - 0.35$$


at x = 1.37. Use 3-digit arithmetic with chopping. Evaluate 
the percent relative error. 
(b) Repeat (a) but express y as 


y = ((x — 7)x + 8)x — 0.35 


Evaluate the error and compare with part (a). 
4.10 The following infinite series can be used to approximate 
$e^x$: 


$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}$$

(a) Prove that this Maclaurin series expansion is a special 
case of the Taylor series expansion (Eq. 4.13) with $x_i = 0$ 
and h = x. 

(b) Use the Taylor series to estimate $f(x) = e^{-x}$ at $x_{i+1} = 1$ for 
$x_i = 0.25$. Employ the zero-, first-, second-, and third- 
order versions and compute the $|\varepsilon_t|$ for each case. 

4.11 The Maclaurin series expansion for cos x is 


$$\cos x = 1 - \frac{x^2}{2} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots$$


Starting with the simplest version, cos x = 1, add terms 
one at a time to estimate cos(π/3). After each new term is 


added, compute the true and approximate percent relative er- 
rors. Use your calculator or MATLAB to determine the true 
value. Add terms until the absolute value of the approximate 
error estimate falls below an error criterion conforming to 
two significant figures. 

4.12 Perform the same computation as in Prob. 4.11, but 
use the Maclaurin series expansion for the sin x to estimate 
sin(π/3). 


4.13 Use zero- through third-order Taylor series expansions 
to predict f(3) for 


$$f(x) = 25x^3 - 6x^2 + 7x - 88$$


using a base point at x = 1. Compute the true percent relative 
error for each approximation. 

4.14 Prove that Eq. (4.11) is exact for all values of x if 
$f(x) = ax^2 + bx + c$. 

4.15 Use zero- through fourth-order Taylor series expan- 
sions to predict f(2) for f(x) = In x using a base point at x = 1. 
Compute the true percent relative error $\varepsilon_t$ for each approximation. 
Discuss the meaning of the results. 

4.16 Use forward and backward difference approximations 
of O(h) and a centered difference approximation of $O(h^2)$ 
to estimate the first derivative of the function examined in 
Prob. 4.13. Evaluate the derivative at x = 2 using a step size 
of h = 0.25. Compare your results with the true value of the 
derivative. Interpret your results on the basis of the remain- 
der term of the Taylor series expansion. 

4.17 Use a centered difference approximation of $O(h^2)$ to 
estimate the second derivative of the function examined in 
Prob. 4.13. Perform the evaluation at x = 2 using step sizes 
of h = 0.2 and 0.1. Compare your estimates with the true 
value of the second derivative. Interpret your results on the 
basis of the remainder term of the Taylor series expansion. 
4.18 If |x| < 1 it is known that 


$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots$$

Repeat Prob. 4.11 for this series for x = 0.1. 

4.19 To calculate a planet’s space coordinates, we have to 


solve the function 
$$f(x) = x - 1 - 0.5 \sin x$$


Let the base point be $a = x_i = \pi/2$ on the interval [0, π]. Determine 
the highest-order Taylor series expansion resulting 
in a maximum error of 0.015 on the specified interval. The 
error is equal to the absolute value of the difference between 
the given function and the specific Taylor series expansion. 
(Hint: Solve graphically.) 
4.20 Consider the function $f(x) = x^3 - 2x + 4$ on the interval 
[-2, 2] with h = 0.25. Use the forward, backward, and 
centered finite difference approximations for the first and 
second derivatives so as to graphically illustrate which ap- 
proximation is most accurate. Graph all three first-derivative 
finite difference approximations along with the theoretical, 
and do the same for the second derivative as well. 

4.21 Derive Eq. (4.30). 

4.22 Repeat Example 4.5, but for f(x) = cos(x) at x = π/6. 
4.23 Repeat Example 4.5, but for the forward divided dif- 
ference (Eq. 4.21). 

4.24 One common instance where subtractive cancellation 
occurs involves finding the roots of a parabola, $ax^2 + bx + c$, 
with the quadratic formula: 


$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$


For cases where $b^2 \gg 4ac$, the difference in the numerator 
can be very small and roundoff errors can occur. In such 
cases, an alternative formulation can be used to minimize 
subtractive cancellation: 


$$x = \frac{-2c}{b \pm \sqrt{b^2 - 4ac}}$$


Use 5-digit arithmetic with chopping to determine the roots 
of the following equation with both versions of the quadratic 
formula. 


$$x^2 - 5000.002x + 10$$


4.25 Develop a well-structured MATLAB function to com- 
pute the Maclaurin series expansion for the cosine function 


as described in Prob. 4.11. Pattern your function after the 
one for the exponential function in Fig. 4.2. Test your program 
for θ = π/3 (60°) and θ = 2π + π/3 = 7π/3 (420°). 
Explain the difference in the number of iterations required 
to obtain the correct result with the desired approximate absolute 
error ($\varepsilon_a$). 

4.26 Develop a well-structured MATLAB function to com- 
pute the Maclaurin series expansion for the sine function as 
described in Prob. 4.12. Pattern your function after the one 
for the exponential function in Fig. 4.2. Test your program 
for θ = π/3 (60°) and θ = 2π + π/3 = 7π/3 (420°). Explain 
the difference in the number of iterations required to obtain 
the correct result with the desired approximate absolute 
error ($\varepsilon_a$). 

4.27 Recall from your calculus class that the Maclaurin 
series, named after the Scottish mathematician Colin Ma- 
claurin (1698-1746), is a Taylor series expansion of a func- 
tion about 0. Use the Taylor series to derive the first four 
terms of the Maclaurin series expansion for the cosine em- 
ployed in Probs. 4.11 and 4.25. 

4.28 The Maclaurin series expansion for the arctangent of x 
is defined for |x| < 1 as 


$$\arctan x = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n + 1} x^{2n+1}$$

(a) Write out the first 4 terms (n = 0,...,3). 

(b) Starting with the simplest version, arctan x = x, add 
terms one at a time to estimate arctan(π/6). After each 
new term is added, compute the true and approximate 
percent relative errors. Use your calculator to determine 
the true value. Add terms until the absolute value of the 
approximate error estimate falls below an error criterion 
conforming to two significant figures. 
Roots and Optimization 


OVERVIEW 


Years ago, you learned to use the quadratic formula 


$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \tag{PT2.1}$$


to solve 

$$f(x) = ax^2 + bx + c = 0 \tag{PT2.2}$$


The values calculated with Eq. (PT2.1) are called the “roots” of Eq. (PT2.2). They repre- 
sent the values of x that make Eq. (PT2.2) equal to zero. For this reason, roots are some- 
times called the zeros of the equation. 

Although the quadratic formula is handy for solving Eq. (PT2.2), there are many other 
functions for which the root cannot be determined so easily. Before the advent of digi- 
tal computers, there were a number of ways to solve for the roots of such equations. For 
some cases, the roots could be obtained by direct methods, as with Eq. (PT2.1). Although 
there were equations like this that could 
be solved directly, there were many more 
that could not. In such instances, the only 
alternative is an approximate solution 
technique. 

One method to obtain an approxi- 
mate solution is to plot the function and 
determine where it crosses the x axis. 
This point, which represents the x value 
for which f(x) = 0, is the root. Although 
graphical methods are useful for obtain- 
ing rough estimates of roots, they are lim- 
ited because of their lack of precision. An 
alternative approach is to use trial and 
error. This “technique” consists of guess- 
ing a value of x and evaluating whether 
f(x) is zero. If not (as is almost always the 
case), another guess is made, and f(x) is 
again evaluated to determine whether the new value provides a better estimate of the root. 
The process is repeated until a guess results in an f(x) that is close to zero. 


FIGURE PT2.1 
A function of a single variable illustrating the difference between roots and optima. 

Such haphazard methods are obviously inefficient and inadequate for the requirements 
of engineering and science practice. Numerical methods represent alternatives that are also 
approximate but employ systematic strategies to home in on the true root. As elaborated in 
the following pages, the combination of these systematic methods and computers makes 
the solution of most applied roots-of-equations problems a simple and efficient task. 
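
To make the contrast with haphazard guessing concrete, the graphical idea of looking for an axis crossing can be automated by scanning a grid for a change in sign. The lines below are only a rough sketch of that idea: the test function, the grid, and the variable names are hypothetical choices of ours, and the scan is not one of the methods developed in the following chapters.

```matlab
% Sketch: locate a root roughly by scanning a grid for a sign change.
% Hypothetical test function (not from the text); its positive root is sqrt(2).
f  = @(x) x.^2 - 2;
x  = linspace(0, 2, 21);          % coarse grid with spacing 0.1
fx = f(x);

% A sign change between neighboring grid points brackets a root.
idx = find(fx(1:end-1).*fx(2:end) < 0);
brackets = [x(idx)' x(idx+1)'];   % each row is [lower upper]
fprintf('root bracketed between %.1f and %.1f\n', brackets');
```

Such a scan yields only an interval that contains the root, and its width is set by the grid spacing. The systematic methods of this part start from this kind of rough information and refine it efficiently.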

Besides roots, other features of interest to engineers and scientists are a function’s 
minimum and maximum values. The determination of such optimal values is referred to 
as optimization. As you learned in calculus, such solutions can be obtained analytically 
by determining the value at which the function is flat; that is, where its derivative is zero. 
Although such analytical solutions are sometimes feasible, most practical optimization prob- 
lems require numerical, computer solutions. From a numerical standpoint, such optimization 
methods are similar in spirit to the root-location methods we just discussed. That is, both 
involve guessing and searching for a location on a function. The fundamental difference be- 
tween the two types of problems is illustrated in Fig. PT2.1. Root location involves searching 
for the location where the function equals zero. In contrast, optimization involves searching 
for the function’s extreme points. 


2.2 PART ORGANIZATION 


The first two chapters in this part are devoted to root location. Chapter 5 focuses on brack- 
eting methods for finding roots. These methods start with guesses that bracket, or contain, 
the root and then systematically reduce the width of the bracket. Two specific methods are 
covered: bisection and false position. Graphical methods are used to provide visual insight 
into the techniques. Error formulations are developed to help you determine how much 
computational effort is required to estimate the root to a prespecified level of precision. 
Chapter 6 covers open methods. These methods also involve systematic trial-and-error 
iterations but do not require that the initial guesses bracket the root. We will discover that 
these methods are usually more computationally efficient than bracketing methods but 
that they do not always work. We illustrate several open methods including the fixed-point 
iteration, Newton-Raphson, and secant methods. 

Following the description of these individual open methods, we then discuss a hybrid 
approach called Brent’s root-finding method that exhibits the reliability of the bracketing 
methods while exploiting the speed of the open methods. As such, it forms the basis for 
MATLAB’s root-finding function, fzero. After illustrating how fzero can be used for 
engineering and scientific problem solving, Chap. 6 ends with a brief discussion of special 
methods devoted to finding the roots of polynomials. In particular, we describe MATLAB’s 
excellent built-in capabilities for this task. 

Chapter 7 deals with optimization. First, we describe two bracketing methods, golden- 
section search and parabolic interpolation, for finding the optima of a function of a single 
variable. Then, we discuss a robust, hybrid approach that combines golden-section search 
and quadratic interpolation. This approach, which again is attributed to Brent, forms the 
basis for MATLAB’s one-dimensional minimum-finding function: fminbnd. After describing and 
illustrating fminbnd, the last part of the chapter provides a brief description of optimiza- 
tion of multidimensional functions. The emphasis is on describing and illustrating the use 
of MATLAB’s capability in this area: the fminsearch function. Finally, the chapter ends 
with an example of how MATLAB can be employed to solve optimization problems in 
engineering and science. 

CHAPTER OBJECTIVES 


The primary objective of this chapter is to acquaint you with bracketing methods for 
finding the root of a single nonlinear equation. Specific objectives and topics covered are 


• Understanding what roots problems are and where they occur in engineering and 
  science. 
• Knowing how to determine a root graphically. 
• Understanding the incremental search method and its shortcomings. 
• Knowing how to solve a roots problem with the bisection method. 
• Knowing how to estimate the error of bisection and why it differs from error 
  estimates for other types of root-location algorithms. 
• Understanding false position and how it differs from bisection. 


YOU’VE GOT A PROBLEM 


Medical studies have established that a bungee jumper’s chances of sustaining a 
significant vertebrae injury increase significantly if the free-fall velocity exceeds 
36 m/s after 4 s of free fall. Your boss at the bungee-jumping company wants 
you to determine the mass at which this criterion is exceeded given a drag coefficient of 
0.25 kg/m. 
You know from your previous studies that the following analytical solution can be 
used to predict fall velocity as a function of time: 


$$v(t) = \sqrt{\frac{g m}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right) \tag{5.1}$$


Try as you might, you cannot manipulate this equation to explicitly solve for m; that is, 
you cannot isolate the mass on the left side of the equation. 
An alternative way of looking at the problem involves subtracting v(t) from both sides 
to give a new function: 


$$f(m) = \sqrt{\frac{g m}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right) - v(t) \tag{5.2}$$


Now we can see that the answer to the problem is the value of m that makes the function 
equal to zero. Hence, we call this a “roots” problem. This chapter will introduce you to how 
the computer is used as a tool to obtain such solutions. 


ROOTS IN ENGINEERING AND SCIENCE 


Although they arise in other problem contexts, roots of equations frequently occur in the 
area of design. Table 5.1 lists a number of fundamental principles that are routinely used 
in design work. As introduced in Chap. 1, mathematical equations or models derived from 
these principles are employed to predict dependent variables as a function of independent 
variables, forcing functions, and parameters. Note that in each case, the dependent vari- 
ables reflect the state or performance of the system, whereas the parameters represent its 
properties or composition. 

An example of such a model is the equation for the bungee jumper’s velocity. If the 
parameters are known, Eq. (5.1) can be used to predict the jumper’s velocity. Such com- 
putations can be performed directly because v is expressed explicitly as a function of the 
model parameters. That is, it is isolated on one side of the equal sign. 
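
A minimal sketch of such a direct computation follows. The drag coefficient and time are taken from the problem posed at the start of the chapter; the value g = 9.81 m/s^2 and the trial mass of 145 kg are assumptions made here for illustration.

```matlab
g  = 9.81;    % acceleration of gravity (m/s^2), assumed value
cd = 0.25;    % drag coefficient (kg/m), from the problem statement
t  = 4;       % time of free fall (s), from the problem statement
m  = 145;     % trial mass (kg), an illustrative choice

% Eq. (5.1): velocity is explicit in m, cd, and t, so one evaluation suffices
v = sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t);
fprintf('v = %.4f m/s\n', v);
```

Going the other way, from a prescribed v back to m, cannot be done with a single evaluation like this.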

However, as posed at the start of the chapter, suppose that we had to determine the 
mass for a jumper with a given drag coefficient to attain a prescribed velocity in a set time 
period. Although Eq. (5.1) provides a mathematical representation of the interrelationship 
among the model variables and parameters, it cannot be solved explicitly for mass. In such 
cases, m is said to be implicit. 


TABLE 5.1 Fundamental principles used in design problems. 

| Fundamental Principle | Dependent Variable | Independent Variable | Parameters |
|---|---|---|---|
| Heat balance | Temperature | Time and position | Thermal properties of material, system geometry |
| Mass balance | Concentration or quantity of mass | Time and position | Chemical behavior of material, mass transfer, system geometry |
| Force balance | Magnitude and direction of forces | Time and position | Strength of material, structural properties, system geometry |
| Energy balance | Changes in kinetic and potential energy | Time and position | Thermal properties, mass of material, system geometry |
| Newton’s laws of motion | Acceleration, velocity, or location | Time and position | Mass of material, system geometry, dissipative parameters |
| Kirchhoff’s laws | Currents and voltages | Time | Electrical properties (resistance, capacitance, inductance) |


This represents a real dilemma, because many design problems involve specifying the 
properties or composition of a system (as represented by its parameters) to ensure that it 
performs in a desired manner (as represented by its variables). Thus, these problems often 
require the determination of implicit parameters. 

The solution to the dilemma is provided by numerical methods for roots of equations. 
To solve the problem using numerical methods, it is conventional to reexpress Eq. (5.1) 
by subtracting the dependent variable v from both sides of the equation to give Eq. (5.2). 
The value of m that makes f(m) = 0 is, therefore, the root of the equation. This value also 
represents the mass that solves the design problem. 

The following pages deal with a variety of numerical and graphical methods for 
determining roots of relationships such as Eq. (5.2). These techniques can be applied to many 
other problems confronted routinely in engineering and science. 


5.2 GRAPHICAL METHODS 


A simple method for obtaining an estimate of the root of the equation f(x) = 0 is to make 
a plot of the function and observe where it crosses the x axis. This point, which represents 
the x value for which f(x) = 0, provides a rough approximation of the root. 


EXAMPLE 5.1 The Graphical Approach 


Problem Statement. Use the graphical approach to determine the mass of the bungee 
jumper with a drag coefficient of 0.25 kg/m to have a velocity of 36 m/s after 4 s of free 
fall. Note: The acceleration of gravity is 9.81 m/s². 


Solution. The following MATLAB session sets up a plot of Eq. (5.2) versus mass: 

>> cd = 0.25; g = 9.81; v = 36; t = 4; 
>> mp = linspace(50,200); 
>> fp = sqrt(g*mp/cd).*tanh(sqrt(g*cd./mp)*t)-v; 
>> plot(mp,fp),grid 


[Plot of f(m) versus mass m from 50 to 200 kg, with the root marked where the curve crosses the m axis.] 


The function crosses the m axis between 140 and 150 kg. Visual inspection of the plot 
provides a rough estimate of the root of 145 kg (about 320 lb). The validity of the graphical 
estimate can be checked by substituting it into Eq. (5.2) to yield 


>> sqrt(g*145/cd)*tanh(sqrt(g*cd/145)*t)-v 


ans = 
0.0456 


which is close to zero. It can also be checked by substituting it into Eq. (5.1) along with the 
parameter values from this example to give 
>> sqrt(g*145/cd)*tanh(sqrt(g*cd/145)*t) 


ans = 
36.0456 


which is close to the desired fall velocity of 36 m/s. 


Graphical techniques are of limited practical value because they are not very precise. 
However, graphical methods can be utilized to obtain rough estimates of roots. These 
estimates can be employed as starting guesses for numerical methods discussed in this chapter. 

Aside from providing rough estimates of the root, graphical interpretations are useful 
for understanding the properties of the functions and anticipating the pitfalls of the 
numerical methods. For example, Fig. 5.1 shows a number of ways in which roots can occur (or 
be absent) in an interval prescribed by a lower bound $x_l$ and an upper bound $x_u$. Figure 5.1b 
depicts the case where a single root is bracketed by negative and positive values of f(x). 
However, Fig. 5.1d, where $f(x_l)$ and $f(x_u)$ are also on opposite sides of the x axis, shows three 
roots occurring within the interval. In general, if $f(x_l)$ and $f(x_u)$ have opposite signs, there 
are an odd number of roots in the interval. As indicated by Fig. 5.1a and c, if $f(x_l)$ and $f(x_u)$ 
have the same sign, there are either no roots or an even number of roots between the values. 

Although these generalizations are usually true, there are cases where they do not hold. 
For example, functions that are tangential to the x axis (Fig. 5.2a) and discontinuous 
functions (Fig. 5.2b) can violate these principles. An example of a function that is tangential 
to the axis is the cubic equation f(x) = (x − 2)(x − 2)(x − 4). Notice that x = 2 makes two 
terms in this polynomial equal to zero. Mathematically, x = 2 is called a multiple root. 
Although they are beyond the scope of this book, there are special techniques that are 
expressly designed to locate multiple roots (Chapra and Canale, 2010). 

The existence of cases of the type depicted in Fig. 5.2 makes it difficult to develop 
foolproof computer algorithms guaranteed to locate all the roots in an interval. However, 
when used in conjunction with graphical approaches, the methods described in the 
following sections are extremely useful for solving many problems confronted routinely by 
engineers, scientists, and applied mathematicians. 


FIGURE 5.1 
Illustration of a number of general ways that a root may 
occur in an interval prescribed by a lower bound $x_l$ and 
an upper bound $x_u$. Parts (a) and (c) indicate that if both 
$f(x_l)$ and $f(x_u)$ have the same sign, either there will be 
no roots or there will be an even number of roots within 
the interval. Parts (b) and (d) indicate that if the function 
has different signs at the end points, there will be an 
odd number of roots in the interval. 


FIGURE 5.2 


5.3 BRACKETING METHODS AND INITIAL GUESSES 


If you had a roots problem in the days before computing, you’d often be told to use “trial and 
error” to come up with the root. That is, you’d repeatedly make guesses until the function 
was sufficiently close to zero. The process was greatly facilitated by the advent of software 
tools such as spreadsheets. By allowing you to make many guesses rapidly, such tools can 
actually make the trial-and-error approach attractive for some problems. 

But, for many other problems, it is preferable to have methods that come up with the 
correct answer automatically. Interestingly, as with trial and error, these approaches require 
an initial “guess” to get started. Then they systematically home in on the root in an iterative 
fashion. 

The two major classes of methods available are distinguished by the type of initial 
guess. They are 


• Bracketing methods. As the name implies, these are based on two initial guesses that 
  “bracket” the root; that is, they are on either side of the root. 
• Open methods. These methods can involve one or more initial guesses, but there is no 
  need for them to bracket the root. 


For well-posed problems, the bracketing methods always work but converge slowly 
(i.e., they typically take more iterations to home in on the answer). In contrast, the open 
methods do not always work (i.e., they can diverge), but when they do work they usually 
converge more quickly. 

In both cases, initial guesses are required. These may naturally arise from the physical 
context you are analyzing. However, in other cases, good initial guesses may not be obvi- 
ous. In such cases, automated approaches to obtain guesses would be useful. The following 
section describes one such approach, the incremental search. 


5.3.1 Incremental Search 


When applying the graphical technique in Example 5.1, you observed that f(x) changed 
sign on opposite sides of the root. In general, if f(x) is real and continuous in the interval 
from $x_l$ to $x_u$ and $f(x_l)$ and $f(x_u)$ have opposite signs, that is, 


$$f(x_l)\, f(x_u) < 0 \tag{5.3}$$


then there is at least one real root between $x_l$ and $x_u$. 
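
The sign test of Eq. (5.3) can be sketched in a few lines. The function is the bungee-jumper relationship of Eq. (5.2) with the parameter values of Example 5.1; the trial bounds of 50 and 200 kg are assumed here for illustration.

```matlab
% Eq. (5.2) with g = 9.81, cd = 0.25, t = 4, and v = 36 (values from Example 5.1)
f = @(m) sqrt(9.81*m/0.25)*tanh(sqrt(9.81*0.25/m)*4) - 36;

xl = 50;  xu = 200;          % trial lower and upper bounds (illustrative)

if f(xl)*f(xu) < 0           % Eq. (5.3): opposite signs at the two bounds
    disp('sign change: at least one real root lies between xl and xu')
else
    disp('no sign change detected between xl and xu')
end
```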

Incremental search methods capitalize on this observation by locating an interval 
where the function changes sign. A potential problem with an incremental search is the 
choice of the increment length. If the length is too small, the search can be very time 
consuming. On the other hand, if the length is too great, there is a possibility that closely 
spaced roots might be missed (Fig. 5.3). The problem is compounded by the possible exis- 
tence of multiple roots. 
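
To see why multiple roots compound the problem, the following sketch scans the cubic f(x) = (x − 2)(x − 2)(x − 4) mentioned earlier in this chapter for sign changes. The interval [0, 5] and the number of points are assumptions chosen so that no grid point lands exactly on a root.

```matlab
f  = @(x) (x - 2).^2 .* (x - 4);   % double root at x = 2, simple root at x = 4
x  = linspace(0, 5, 1000);         % fine grid; spacing is about 0.005
fx = f(x);

% indices k where f changes sign between x(k) and x(k+1)
k = find(fx(1:end-1).*fx(2:end) < 0);

fprintf('sign changes found: %d\n', numel(k));
fprintf('bracket: [%.4f, %.4f]\n', x(k), x(k+1));
```

Only the simple root at x = 4 is bracketed. Because the function touches the axis at x = 2 without crossing it, no increment length, however small, produces a sign change there.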

An M-file can be developed¹ that implements an incremental search to locate the roots 
of a function func within the range from xmin to xmax (Fig. 5.4). An optional argument ns 
allows the user to specify the number of intervals within the range. If ns is omitted, it is 
automatically set to 50. A for loop is used to step through each interval. In the event that a 
sign change occurs, the upper and lower bounds are stored in an array xb. 


¹ This function is a modified version of an M-file originally presented by Recktenwald (2000). 


FIGURE 5.3 

Cases where roots could be missed because the incremental length of the search procedure 
is too large. Note that the last root on the right is multiple and would be missed regardless of 
the increment length. 


function xb = incsearch(func,xmin,xmax,ns) 
% incsearch: incremental search root locator 
%   xb = incsearch(func,xmin,xmax,ns): 
%     finds brackets of x that contain sign changes 
%     of a function on an interval 
% input: 
%   func = name of function 
%   xmin, xmax = endpoints of interval 
%   ns = number of subintervals (default = 50) 
% output: 
%   xb(k,1) is the lower bound of the kth sign change 
%   xb(k,2) is the upper bound of the kth sign change 
%   If no brackets found, xb = []. 

if nargin < 3, error('at least 3 arguments required'), end 
if nargin < 4, ns = 50; end %if ns blank set to 50 

% Incremental search 
x = linspace(xmin,xmax,ns); 
f = func(x); 
nb = 0; xb = []; %xb is null unless sign change detected 
for k = 1:length(x)-1 
  if sign(f(k)) ~= sign(f(k+1)) %check for sign change 
    nb = nb + 1; 
    xb(nb,1) = x(k); 
    xb(nb,2) = x(k+1); 
  end 
end 

if isempty(xb) %display that no brackets were found 
  disp('no brackets found') 
  disp('check interval or increase ns') 
else 
  disp('number of brackets:') %display number of brackets 
  disp(nb) 
end 


FIGURE 5.4 
An M-file to implement an incremental search. 


EXAMPLE 5.2 Incremental Search 


Problem Statement. Use the M-file incsearch (Fig. 5.4) to identify brackets within the 
interval [3, 6] for the function: 


$$f(x) = \sin(10x) + \cos(3x) \tag{5.4}$$


Solution. The MATLAB session using the default number of intervals (50) is 


>> incsearch(@(x) sin(10*x)+cos(3*x) ,3,6) 
number of brackets: 
5 

ans = 

3.2449 3.3061 

3.3061 3.3673 

3.7347 3.7959 

4.6531 4.7143 

5.6327 5.6939 


A plot of Eq. (5.4) along with the root locations is shown here. 


[Plot of Eq. (5.4) for x from 3 to 6, with the bracketed root locations marked.] 


Although five sign changes are detected, because the subintervals are too wide, the 
function misses possible roots at x ≅ 4.25 and 5.2. These possible roots look like they might be 
double roots. However, by using the zoom in tool, it is clear that each represents two real 
roots that are very close together. The function can be run again with more subintervals 
with the result that all nine sign changes are located: 

>> incsearch(@(x) sin(10*x)+cos(3*x),3,6,100) 
number of brackets: 
9 

ans = 

3.2424 3.2727 
3.3636 3.3939 
3.7273 3.7576 
4.2121 4.2424 
4.2424 4.2727 
4.6970 4.7273 
5.1515 5.1818 
5.1818 5.2121 
5.6667 5.6970 


[Plot of Eq. (5.4) for x from 3 to 6, with all nine bracketed root locations marked.] 

The foregoing example illustrates that brute-force methods such as incremental search 
are not foolproof. You would be wise to supplement such automatic techniques with any 
other information that provides insight into the location of the roots. Such information can 
be found by plotting the function and through understanding the physical problem from 
which the equation originated. 
5.4 BISECTION 
The bisection method is a variation of the incremental search method in which the interval 
is always divided in half. If a function changes sign over an interval, the function value at 
the midpoint is evaluated. The location of the root is then determined as lying within the 
subinterval where the sign change occurs. The subinterval then becomes the interval for 
the next iteration. The process is repeated until the root is known to the required precision. 
A graphical depiction of the method is provided in Fig. 5.5. The following example goes 
through the actual computations involved in the method. 
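
Before the worked example, the halving logic itself can be sketched as a short loop. The test function x^2 − 2, the bracket [1, 2], and the fixed count of 20 iterations are assumptions made purely for illustration; a stopping criterion based on an error estimate is developed later in this section.

```matlab
f  = @(x) x.^2 - 2;     % illustrative function (assumption); its root is sqrt(2)
xl = 1;  xu = 2;        % initial bracket: f(xl) < 0 < f(xu)

for iter = 1:20
    xr = (xl + xu)/2;           % midpoint of the current interval
    test = f(xl)*f(xr);
    if test < 0
        xu = xr;                % sign change lies in the lower half
    elseif test > 0
        xl = xr;                % sign change lies in the upper half
    else
        break                   % f(xr) is exactly zero
    end
end

fprintf('estimate after %d iterations: %.6f\n', iter, xr);
```

Each pass keeps only the half-interval that still contains the sign change, so the bracket width is halved every iteration.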
EXAMPLE 5.3 The Bisection Method 


Problem Statement. Use bisection to solve the same problem approached graphically in 
Example 5.1. 


Solution. The first step in bisection is to guess two values of the unknown (in the present 
problem, m) that give values for f(m) with different signs. From the graphical solution in 
Example 5.1, we can see that the function changes sign between values of 50 and 200. The 
plot obviously suggests better initial guesses, say 140 and 150, but for illustrative purposes 
let’s assume we don’t have the benefit of the plot and have made conservative guesses. 
Therefore, the initial estimate of the root $x_r$ lies at the midpoint of the interval 


$$x_r = \frac{50 + 200}{2} = 125$$


FIGURE 5.5 
A graphical depiction of the bisection method. This plot corresponds to the first four iterations 
from Example 5.3. 


Note that the exact value of the root is 142.7376. This means that the value of 125 
calculated here has a true percent relative error of 


$$|\varepsilon_t| = \left|\frac{142.7376 - 125}{142.7376}\right| \times 100\% = 12.43\%$$


Next we compute the product of the function value at the lower bound and at the midpoint: 

$$f(50)\,f(125) = -4.579(-0.409) = 1.871$$


which is greater than zero, and hence no sign change occurs between the lower bound and 
the midpoint. Consequently, the root must be located in the upper interval between 125 and 
200. Therefore, we create a new interval by redefining the lower bound as 125. 
At this point, the new interval extends from $x_l$ = 125 to $x_u$ = 200. A revised root 
estimate can then be calculated as 


$$x_r = \frac{125 + 200}{2} = 162.5$$


which represents a true percent error of $|\varepsilon_t|$ = 13.85%. The process can be repeated to obtain 
refined estimates. For example, 


$$f(125)\,f(162.5) = -0.409(0.359) = -0.147$$


Therefore, the root is now in the lower interval between 125 and 162.5. The upper bound is 
redefined as 162.5, and the root estimate for the third iteration is calculated as 


$$x_r = \frac{125 + 162.5}{2} = 143.75$$


which represents a percent relative error of $|\varepsilon_t|$ = 0.709%. The method can be repeated until 
the result is accurate enough to satisfy your needs. 


We ended Example 5.3 with the statement that the method could be continued to 
obtain a refined estimate of the root. We must now develop an objective criterion for deciding 
when to terminate the method. 

An initial suggestion might be to end the calculation when the error falls below some 
prespecified level. For instance, in Example 5.3, the true relative error dropped from 12.43 
to 0.709% during the course of the computation. We might decide that we should terminate 
when the error drops below, say, 0.5%. This strategy is flawed because the error estimates 
in the example were based on knowledge of the true root of the function. This would not be 
the case in an actual situation because there would be no point in using the method if we 
already knew the root. 

Therefore, we require an error estimate that is not contingent on foreknowledge of the 
root. One way to do this is by estimating an approximate percent relative error as in [recall 
Eq. (4.5)] 


$$|\varepsilon_a| = \left|\frac{x_r^{\text{new}} - x_r^{\text{old}}}{x_r^{\text{new}}}\right| 100\% \tag{5.5}$$


where $x_r^{\text{new}}$ is the root for the present iteration and $x_r^{\text{old}}$ is the root from the previous 
iteration. When $|\varepsilon_a|$ becomes less than a prespecified stopping criterion $\varepsilon_s$, the computation is 
terminated. 


EXAMPLE 5.4 Error Estimates for Bisection 

Problem Statement. Continue Example 5.3 until the approximate error falls below a 
stopping criterion of $\varepsilon_s$ = 0.5%. Use Eq. (5.5) to compute the errors. 


Solution. The results of the first two iterations for Example 5.3 were 125 and 162.5. 
Substituting these values into Eq. (5.5) yields 


$$|\varepsilon_a| = \left|\frac{162.5 - 125}{162.5}\right| 100\% = 23.08\%$$

Recall that the true percent relative error for the root estimate of 162.5 was 13.85%. 
Therefore, $|\varepsilon_a|$ is greater than $|\varepsilon_t|$. This behavior is manifested for the other iterations: 


Iteration   x_l        x_u       x_r        |ε_a| (%)   |ε_t| (%) 
1           50         200       125                    12.43 
2           125        200       162.5      23.08       13.85 
3           125        162.5     143.75     13.04       0.71 
4           125        143.75    134.375    6.98        5.86 
5           134.375    143.75    139.0625   3.37        2.58 
6           139.0625   143.75    141.4063   1.66        0.93 
7           141.4063   143.75    142.5781   0.82        0.11 
8           142.5781   143.75    143.1641   0.41        0.30 


Thus after eight iterations $|\varepsilon_a|$ finally falls below $\varepsilon_s$ = 0.5%, and the computation can be 
terminated. 
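
The iterations of this example can be reproduced with a short script, sketched below. The variable names are illustrative, and the true root 142.7376 is taken from Example 5.3 solely so that the true error can be printed alongside the approximate error of Eq. (5.5).

```matlab
% Eq. (5.2) with g = 9.81, cd = 0.25, t = 4, and v = 36
f = @(m) sqrt(9.81*m/0.25)*tanh(sqrt(9.81*0.25/m)*4) - 36;

xtrue = 142.7376;        % true root quoted in Example 5.3
xl = 50;  xu = 200;      % initial guesses
es = 0.5;                % stopping criterion (%)
xr = xl;  iter = 0;

fprintf('iter       xl         xu         xr     ea(%%)   et(%%)\n');
while true
    xrold = xr;
    xr = (xl + xu)/2;                      % new midpoint estimate
    iter = iter + 1;
    if iter == 1
        ea = NaN;                          % no previous estimate to compare with
    else
        ea = abs((xr - xrold)/xr)*100;     % Eq. (5.5)
    end
    et = abs((xtrue - xr)/xtrue)*100;      % true percent relative error
    fprintf('%4d %10.4f %10.4f %10.4f %7.2f %7.2f\n', iter, xl, xu, xr, ea, et);
    if f(xl)*f(xr) < 0, xu = xr; else, xl = xr; end
    if ea < es, break, end                 % a NaN comparison is false, so iteration 1 continues
end
```

In practice only the approximate error is available; the true-error column is included here for comparison with the table above.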

These results are summarized in Fig. 5.6. The “ragged” nature of the true error is due 
to the fact that, for bisection, the true root can lie anywhere within the bracketing interval. 
The true and approximate errors are far apart when the interval happens to be centered 
on the true root. They are close when the true root falls at either end of the interval. 


FIGURE 5.6 
Errors for the bisection method. True and approximate errors are plotted versus the number 
of iterations. 


Although the approximate error does not provide an exact estimate of the true error, 
Fig. 5.6 suggests that $|\varepsilon_a|$ captures the general downward trend of $|\varepsilon_t|$. In addition, the plot 
exhibits the extremely attractive characteristic that $|\varepsilon_a|$ is always greater than $|\varepsilon_t|$. Thus, 
when $|\varepsilon_a|$ falls below $\varepsilon_s$, the computation could be terminated with confidence that the root 
is known to be at least as accurate as the prespecified acceptable level. 

While it is dangerous to draw general conclusions from a single example, it can be 
demonstrated that $|\varepsilon_a|$ will always be greater than $|\varepsilon_t|$ for bisection. This is due to the fact 
that each time an approximate root is located using bisection as $x_r = (x_l + x_u)/2$, we know 
that the true root lies somewhere within an interval of $\Delta x = x_u - x_l$. Therefore, the root 
must lie within $\pm\Delta x/2$ of our estimate. For instance, when Example 5.4 was terminated, 
we could make the definitive statement that 


$$x_r = 143.1641 \pm \frac{143.7500 - 142.5781}{2} = 143.1641 \pm 0.5859$$

In essence, Eq. (5.5) provides an upper bound on the true error. For this bound to 
be exceeded, the true root would have to fall outside the bracketing interval, which by 
definition could never occur for bisection. Other root-locating techniques do not always 
behave as nicely. Although bisection is generally slower than other methods, the neatness 
of its error analysis is a positive feature that makes it attractive for certain engineering and 
scientific applications. 

Another benefit of the bisection method is that the number of iterations required to 
attain an absolute error can be computed a priori—that is, before starting the computation. 
This can be seen by recognizing that before starting the technique, the absolute error is 


$$E_a^0 = x_u^0 - x_l^0 = \Delta x^0$$


where the superscript designates the iteration. Hence, before starting the method we are at 
the “zero iteration.” After the first iteration, the error becomes 

$$E_a^1 = \frac{\Delta x^0}{2}$$


Because each succeeding iteration halves the error, a general formula relating the error and 
the number of iterations n is 


$$E_a^n = \frac{\Delta x^0}{2^n}$$

If $E_{a,d}$ is the desired error, this equation can be solved for² 

$$n = \frac{\log(\Delta x^0/E_{a,d})}{\log 2} = \log_2\left(\frac{\Delta x^0}{E_{a,d}}\right) \tag{5.6}$$


Let’s test the formula. For Example 5.4, the initial interval was $\Delta x^0$ = 200 − 50 = 150. 
After eight iterations, the absolute error was 


$$E_a = \frac{|143.7500 - 142.5781|}{2} = 0.5859$$


We can substitute these values into Eq. (5.6) to give 

$$n = \log_2(150/0.5859) = 8$$

Thus, if we knew beforehand that an error of less than 0.5859 was acceptable, the formula 
tells us that eight iterations would yield the desired result. 

² MATLAB provides the log2 function to evaluate the base-2 logarithm directly. If the pocket calculator or 
computer language you are using does not include the base-2 logarithm as an intrinsic function, this equation 
shows a handy way to compute it. In general, $\log_b(x) = \log(x)/\log(b)$. 

Although we have emphasized the use of relative errors for obvious reasons, there 
will be cases where (usually through knowledge of the problem context) you will be able 
to specify an absolute error. For these cases, bisection along with Eq. (5.6) can provide a 
useful root-location algorithm. 
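
A brief sketch of how Eq. (5.6) might be used in practice follows. The first calculation repeats the check above with the unrounded error 150/2^8; the second tolerance of 0.01 is an assumed value, and rounding up with ceil is a choice made here because only whole iterations can be performed.

```matlab
dx0 = 200 - 50;              % initial bracket width from Example 5.4

Ead = dx0/2^8;               % error after eight iterations (0.5859 when rounded)
n   = log2(dx0/Ead);         % Eq. (5.6)
fprintf('Ead = %.4f  ->  n = %g iterations\n', Ead, n);

Ead2 = 0.01;                 % illustrative absolute tolerance (assumption)
n2   = ceil(log2(dx0/Ead2)); % round up to a whole number of iterations
fprintf('Ead = %.4f  ->  n = %d iterations\n', Ead2, n2);
```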


5.4.1 MATLAB M-file: bisect 


An M-file to implement bisection is displayed in Fig. 5.7. It is passed the function (func) 
along with lower (xl) and upper (xu) guesses. In addition, an optional stopping criterion (es) 
and maximum iterations (maxit) can be entered. The function first checks whether there 
are sufficient arguments and if the initial guesses bracket a sign change. If not, an error 
message is displayed and the function is terminated. It also assigns default values if maxit 
and es are not supplied. Then a while...break loop is employed to implement the bisection 
algorithm until the approximate error falls below es or the iterations exceed maxit. 


FIGURE 5.7 
An M-file to implement the bisection method. 


function [root,fx,ea,iter]=bisect(func,xl,xu,es,maxit,varargin) 
% bisect: root location zeroes 
%   [root,fx,ea,iter]=bisect(func,xl,xu,es,maxit,p1,p2,...): 
%     uses bisection method to find the root of func 
% input: 
%   func = name of function 
%   xl, xu = lower and upper guesses 
%   es = desired relative error (default = 0.0001%) 
%   maxit = maximum allowable iterations (default = 50) 
%   p1,p2,... = additional parameters used by func 
% output: 
%   root = real root 
%   fx = function value at root 
%   ea = approximate relative error (%) 
%   iter = number of iterations 

if nargin<3,error('at least 3 input arguments required'),end 
test = func(xl,varargin{:})*func(xu,varargin{:}); 
if test>0,error('no sign change'),end 
if nargin<4||isempty(es), es=0.0001;end 
if nargin<5||isempty(maxit), maxit=50;end 
iter = 0; xr = xl; ea = 100; 
while (1) 
  xrold = xr; 
  xr = (xl + xu)/2;                 % midpoint of current bracket 
  iter = iter + 1; 
  if xr ~= 0,ea = abs((xr - xrold)/xr) * 100;end 
  test = func(xl,varargin{:})*func(xr,varargin{:}); 
  if test < 0 
    xu = xr;                        % root lies in lower subinterval 
  elseif test > 0 
    xl = xr;                        % root lies in upper subinterval 
  else 
    ea = 0;                         % xr is an exact root 
  end 
  if ea <= es || iter >= maxit,break,end 
end 
root = xr; fx = func(xr, varargin{:}); 


We can employ this function to solve the problem posed at the beginning of the chapter. 
Recall that you need to determine the mass at which a bungee jumper’s free-fall velocity 
exceeds 36 m/s after 4 s of free fall given a drag coefficient of 0.25 kg/m. Thus, you have to 
find the root of 


$$f(m) = \sqrt{\frac{9.81\,m}{0.25}} \tanh\left(\sqrt{\frac{9.81(0.25)}{m}}\; 4\right) - 36$$


In Example 5.1 we generated a plot of this function versus mass and estimated that the root 
fell between 140 and 150 kg. The bisect function from Fig. 5.7 can be used to determine 
the root with the following script 


fm=@(m,cd,t,v) sqrt(9.81*m/cd)*tanh(sqrt(9.81*cd/m)*t)-v; 
[mass fx ea iter]=bisect(@(m) fm(m,0.25,4,36),40,200) 


mass = 
  142.7377 
fx = 
  4.6089e-007 
ea = 
  5.345e-005 
iter = 
  21 


Thus, a result of m = 142.74 kg is obtained after 21 iterations with an approximate relative 
error of $\varepsilon_a$ = 0.00005345%, and a function value close to zero. 


5.5 FALSE POSITION 


False position (also called the linear interpolation method) is another well-known 
bracketing method. It is very similar to bisection with the exception that it uses a different strategy 
to come up with its new root estimate. Rather than bisecting the interval, it locates the root 
by joining $f(x_l)$ and $f(x_u)$ with a straight line (Fig. 5.8). The intersection of this line with 
the x axis represents an improved estimate of the root. Thus, the shape of the function 
influences the new root estimate. Using similar triangles, the intersection of the straight line 
with the x axis can be estimated as (see Chapra and Canale, 2010, for details), 


$$x_r = x_u - \frac{f(x_u)(x_l - x_u)}{f(x_l) - f(x_u)} \tag{5.7}$$


This is the false-position formula. The value of $x_r$ computed with Eq. (5.7) then 
replaces whichever of the two initial guesses, $x_l$ or $x_u$, yields a function value with the same 
sign as $f(x_r)$. In this way the values of $x_l$ and $x_u$ always bracket the true root. The process 
is repeated until the root is estimated adequately. The algorithm is identical to the one for 
bisection (Fig. 5.7) with the exception that Eq. (5.7) is used. 


FIGURE 5.8 
False position. 


EXAMPLE 5.5 The False-Position Method 


Problem Statement. Use false position to solve the same problem approached graphi- 
cally and with bisection in Examples 5.1 and 5.3. 


Solution. As in Example 5.3, initiate the computation with guesses of $x_l$ = 50 and 
$x_u$ = 200. 


First iteration: 

$x_l$ = 50     $f(x_l)$ = −4.579387 
$x_u$ = 200    $f(x_u)$ = 0.860291 

$$x_r = 200 - \frac{0.860291(50 - 200)}{-4.579387 - 0.860291} = 176.2773$$


which has a true relative error of 23.5%. 


Second iteration: 

$$f(x_l)\,f(x_r) = -2.592732$$

Therefore, the root lies in the first subinterval, and $x_r$ becomes the upper limit for the next 
iteration, $x_u$ = 176.2773. 

$x_l$ = 50          $f(x_l)$ = −4.579387 
$x_u$ = 176.2773    $f(x_u)$ = 0.566174 

$$x_r = 176.2773 - \frac{0.566174(50 - 176.2773)}{-4.579387 - 0.566174} = 162.3828$$


which has true and approximate relative errors of 13.76% and 8.56%, respectively. Addi- 
tional iterations can be performed to refine the estimates of the root. 
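
The two iterations of this example can be sketched as a loop built around Eq. (5.7). The variable names are illustrative, and the true root 142.7376 from Example 5.3 is used only to report the true error.

```matlab
% Eq. (5.2) with g = 9.81, cd = 0.25, t = 4, and v = 36
f = @(m) sqrt(9.81*m/0.25)*tanh(sqrt(9.81*0.25/m)*4) - 36;

xtrue = 142.7376;          % true root quoted in Example 5.3
xl = 50;  xu = 200;        % initial guesses
xr = zeros(1, 2);          % storage for the two root estimates

for i = 1:2
    xr(i) = xu - f(xu)*(xl - xu)/(f(xl) - f(xu));   % Eq. (5.7)
    et = abs((xtrue - xr(i))/xtrue)*100;            % true percent relative error
    fprintf('iteration %d: xr = %.4f, true error = %.2f%%\n', i, xr(i), et);
    if f(xl)*f(xr(i)) < 0
        xu = xr(i);        % root lies in the lower subinterval
    else
        xl = xr(i);        % root lies in the upper subinterval
    end
end
```

Increasing the loop count continues the refinement; replacing the Eq. (5.7) line with the midpoint formula would turn the same loop back into bisection.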


Although false position often performs better than bisection, there are other cases 
where it does not. As in the following example, there are certain cases where bisection 
yields superior results. 


EXAMPLE 5.6 A Case Where Bisection Is Preferable to False Position 
Problem Statement. Use bisection and false position to locate the root of 
fQ) =x 1 
between x = 0 and 1.3. 


Solution. Using bisection, the results can be summarized as 


Iteration x; x, x E, (%) €, (%) 
1 0 1:3 0.65 100.0 35 
2 0.65 1:3 0.975 33:3 2:5 
3 0.975 133 131375 14.3 13.8 
4 0.975 1.1375 1.05625 Tl 5.6 
5 0.975 1.05625 1.015625 4.0 1.6 


Thus, after five iterations, the true error is reduced to less than 2%. For false position, a 
very different outcome is obtained: 


Iteration   x_l       x_u        x_r         ε_a (%)   ε_t (%)
1           0         1.3        0.09430               90.6
2           0.09430   1.3        0.18176      48.1     81.8
3           0.18176   1.3        0.26287      30.9     73.7
4           0.26287   1.3        0.33811      22.3     66.2
5           0.33811   1.3        0.40788      17.1     59.2


After five iterations, the true error has only been reduced to about 59%. Insight into
these results can be gained by examining a plot of the function. As in Fig. 5.9, the curve
violates the premise on which false position was based: that if $f(x_l)$ is much closer to
zero than $f(x_u)$, then the root should be much closer to $x_l$ than to $x_u$ (recall Fig. 5.8). Because
of the shape of the present function, the opposite is true.


FIGURE 5.9
Plot of $f(x) = x^{10} - 1$, illustrating slow convergence of the false-position method.
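
The two tables of this example can be regenerated side by side. The script below is a minimal sketch with illustrative variable names; it runs five iterations of each method on the same bracket and prints the estimate and its true percent relative error (the true root is 1).

```matlab
% Sketch: bisection versus false position on f(x) = x^10 - 1 over [0, 1.3].
f = @(x) x.^10 - 1;
xtrue = 1;
xlb = 0; xub = 1.3;                 % bracket used by bisection
xlf = 0; xuf = 1.3;                 % bracket used by false position
for iter = 1:5
    % Bisection: the new estimate is the midpoint of the bracket
    xrb = (xlb + xub)/2;
    if f(xlb)*f(xrb) < 0, xub = xrb; else, xlb = xrb; end
    % False position: the new estimate comes from Eq. (5.7)
    xrf = xuf - f(xuf)*(xlf - xuf)/(f(xlf) - f(xuf));
    if f(xlf)*f(xrf) < 0, xuf = xrf; else, xlf = xrf; end
    fprintf('%d   bisection %.6f (%.1f%%)   false position %.5f (%.1f%%)\n', ...
        iter, xrb, abs(xtrue - xrb)/xtrue*100, xrf, abs(xtrue - xrf)/xtrue*100);
end
```

Note that the upper bound of the false-position bracket never moves from 1.3, which is the one-sidedness discussed after this example.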


The foregoing example illustrates that blanket generalizations regarding root- 
location methods are usually not possible. Although a method such as false position is 
often superior to bisection, there are invariably cases that violate this general conclusion. 
Therefore, in addition to using Eq. (5.5), the results should always be checked by substi- 
tuting the root estimate into the original equation and determining whether the result is 
close to zero. 

The example also illustrates a major weakness of the false-position method: its one- 
sidedness. That is, as iterations are proceeding, one of the bracketing points will tend to 
stay fixed. This can lead to poor convergence, particularly for functions with significant 
curvature. Possible remedies for this shortcoming are available elsewhere (Chapra and 
Canale, 2010).


5.6 CASE STUDY GREENHOUSE GASES AND RAINWATER


Background. It is well documented that the atmospheric levels of several so-called
“greenhouse” gases have been increasing over the past 50 years. For example, Fig. 5.10
shows data for the partial pressure of carbon dioxide (CO2) collected at Mauna Loa,
Hawaii from 1958 through 2008. The trend in these data can be nicely fit with a quadratic
polynomial,³

$$p_{CO_2} = 0.012226(t - 1983)^2 + 1.418542(t - 1983) + 342.38309$$

where $p_{CO_2}$ = CO2 partial pressure (ppm). These data indicate that levels have increased
a little over 22% over the period, from 315 to 386 ppm.

One question that we can address is how this trend is affecting the pH of rainwater. 
Outside of urban and industrial areas, it is well documented that carbon dioxide is the pri- 
mary determinant of the pH of the rain. pH is the measure of the activity of hydrogen ions 
and, therefore, its acidity or alkalinity. For dilute aqueous solutions, it can be computed as 


$$\mathrm{pH} = -\log_{10}[\mathrm{H}^+] \tag{5.8}$$

where $[\mathrm{H}^+]$ is the molar concentration of hydrogen ions.

The following five equations govern the chemistry of rainwater:

$$K_1 = 10^6 \frac{[\mathrm{H}^+][\mathrm{HCO}_3^-]}{K_H p_{CO_2}} \tag{5.9}$$

$$K_2 = \frac{[\mathrm{H}^+][\mathrm{CO}_3^{2-}]}{[\mathrm{HCO}_3^-]} \tag{5.10}$$

$$K_w = [\mathrm{H}^+][\mathrm{OH}^-] \tag{5.11}$$

$$c_T = \frac{K_H p_{CO_2}}{10^6} + [\mathrm{HCO}_3^-] + [\mathrm{CO}_3^{2-}] \tag{5.12}$$

$$0 = [\mathrm{HCO}_3^-] + 2[\mathrm{CO}_3^{2-}] + [\mathrm{OH}^-] - [\mathrm{H}^+] \tag{5.13}$$

where $K_H$ = Henry’s constant, and $K_1$, $K_2$, and $K_w$ are equilibrium coefficients. The five
unknowns are $c_T$ = total inorganic carbon, $[\mathrm{HCO}_3^-]$ = bicarbonate, $[\mathrm{CO}_3^{2-}]$ = carbonate,
$[\mathrm{H}^+]$ = hydrogen ion, and $[\mathrm{OH}^-]$ = hydroxyl ion. Notice how the partial pressure of CO2
shows up in Eqs. (5.9) and (5.12).


FIGURE 5.10
Average annual partial pressures of atmospheric carbon dioxide (ppm) measured at Mauna
Loa, Hawaii.

³ In Part Four, we will learn how to determine such polynomials.

Use these equations to compute the pH of rainwater given that $K_H = 10^{-1.46}$,
$K_1 = 10^{-6.3}$, $K_2 = 10^{-10.3}$, and $K_w = 10^{-14}$. Compare the results in 1958 when the $p_{CO_2}$ was
315 ppm and in 2008 when it was 386 ppm. When selecting a numerical method for your
computation, consider the following:


• You know with certainty that the pH of rain in pristine areas always falls between 2
  and 12.
• You also know that pH can only be measured to two places of decimal precision.


Solution. There are a variety of ways to solve this system of five equations. One way is
to eliminate unknowns by combining them to produce a single function that only depends
on $[\mathrm{H}^+]$. To do this, first solve Eqs. (5.9) and (5.10) for

$$[\mathrm{HCO}_3^-] = \frac{K_1}{10^6 [\mathrm{H}^+]} K_H p_{CO_2} \tag{5.14}$$

$$[\mathrm{CO}_3^{2-}] = \frac{K_2 [\mathrm{HCO}_3^-]}{[\mathrm{H}^+]} \tag{5.15}$$

Substitute Eq. (5.14) into (5.15)

$$[\mathrm{CO}_3^{2-}] = \frac{K_2 K_1}{10^6 [\mathrm{H}^+]^2} K_H p_{CO_2} \tag{5.16}$$

Equations (5.14) and (5.16) can be substituted along with Eq. (5.11) into Eq. (5.13) to give

$$0 = \frac{K_1}{10^6 [\mathrm{H}^+]} K_H p_{CO_2} + 2 \frac{K_2 K_1}{10^6 [\mathrm{H}^+]^2} K_H p_{CO_2} + \frac{K_w}{[\mathrm{H}^+]} - [\mathrm{H}^+] \tag{5.17}$$

Although it might not be immediately apparent, this result is a third-order polynomial in
$[\mathrm{H}^+]$. Thus, its root can be used to compute the pH of the rainwater.

Now we must decide which numerical method to employ to obtain the solution. There
are two reasons why bisection would be a good choice. First, the fact that the pH always
falls within the range from 2 to 12 provides us with two good initial guesses. Second,
because the pH can only be measured to two decimal places of precision, we will be satisfied
with an absolute error of $E_{a,d} = \pm 0.005$. Remember that given an initial bracket and the
desired error, we can compute the number of iterations a priori. Substituting the present
values into Eq. (5.6) gives

>> dx=12-2;
>> Ead=0.005;
>> n=log2(dx/Ead)
n =
   10.9658


Eleven iterations of bisection will produce the desired precision. 

Before implementing bisection, we must first express Eq. (5.17) as a function. Because 
it is relatively complicated, we will store it as an M-file: 

function f = fpH(pH,pCO2)
% Residual of Eq. (5.17) as a function of pH and CO2 partial pressure (ppm)
K1=10^-6.3;K2=10^-10.3;Kw=10^-14;
KH=10^-1.46;
H=10^-pH;    % hydrogen ion concentration from Eq. (5.8)
f=K1/(1e6*H)*KH*pCO2+2*K2*K1/(1e6*H^2)*KH*pCO2+Kw/H-H;

We can then use the M-file from Fig. 5.7 to obtain the solution. Notice how we have set
the value of the desired relative error ($\varepsilon_s = 1 \times 10^{-8}$) at a very low level so that the iteration
limit (maxit) is reached first and exactly 11 iterations are implemented:

>> [pH1958 fx ea iter]=bisect(@(pH) fpH(pH,315),2,12,1e-8,11)

pH1958 =
    5.6279
fx =
 -2.7163e-008
ea =
    0.0868
iter =
    11


Thus, the pH is computed as 5.6279 with a relative error of 0.0868%. We can be confident
that the rounded result of 5.63 is correct to two decimal places. This can be verified by
performing another run with more iterations. For example, setting maxit to 50 yields


>> [pH1958 fx ea iter]=bisect(@(pH) fpH(pH,315),2,12,1e-8,50) 


pH1958 = 
5.6304 

fx 

.615e - 015 


ea =
  5.1690e-009
iter =
    35


For 2008, the result is 
>> [pH2008 fx ea iter]=bisect(@(pH) fpH(pH,386),2,12,1e-8,50)

pH2008 =
    5.5864
fx =
  3.2926e-015
ea =
  5.2098e-009
iter =
    35


Interestingly, the results indicate that the 22.5% rise in atmospheric CO2 levels has
produced only a 0.78% drop in pH. Although this is certainly true, remember that the pH
represents a logarithmic scale as defined by Eq. (5.8). Consequently, a unit drop in pH
represents an order-of-magnitude (i.e., a 10-fold) increase in the hydrogen ion. The
concentration can be computed as $[\mathrm{H}^+] = 10^{-\mathrm{pH}}$ and its percent change can be calculated as


>> ((10^-pH2008-10^-pH1958)/10^-pH1958)*100

ans =
   10.6791


Therefore, the hydrogen ion concentration has increased about 10.7%. 
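
For readers who do not have the bisect M-file of Fig. 5.7 at hand, the whole calculation can be sketched in one self-contained script. The inline bisection loop below is a stand-in for that function, not a copy of it, and the names hion, fpH, and pHrain are illustrative. It evaluates Eq. (5.17) through anonymous functions and uses 50 halvings of the bracket [2, 12], far more than the 11 required by Eq. (5.6).

```matlab
% Sketch: pH of rainwater in 1958 and 2008 from Eq. (5.17) by bisection.
KH = 10^-1.46; K1 = 10^-6.3; K2 = 10^-10.3; Kw = 10^-14;
hion = @(pH) 10.^(-pH);                        % [H+] from Eq. (5.8)
fpH = @(pH, pCO2) K1./(1e6*hion(pH))*KH*pCO2 ...
    + 2*K2*K1./(1e6*hion(pH).^2)*KH*pCO2 + Kw./hion(pH) - hion(pH);

pCO2 = [315 386];                              % ppm in 1958 and 2008
pHrain = zeros(size(pCO2));
for k = 1:numel(pCO2)
    xl = 2; xu = 12;                           % pH is known to lie in [2, 12]
    for iter = 1:50
        xr = (xl + xu)/2;
        if fpH(xl, pCO2(k))*fpH(xr, pCO2(k)) < 0
            xu = xr;                           % sign change in lower half
        else
            xl = xr;                           % sign change in upper half
        end
    end
    pHrain(k) = xr;
end
change = (hion(pHrain(2)) - hion(pHrain(1)))/hion(pHrain(1))*100;
fprintf('pH 1958 = %.4f, pH 2008 = %.4f, [H+] change = %.1f%%\n', ...
    pHrain(1), pHrain(2), change);
```

Rounded to two decimal places, the script should reproduce the values 5.63 and 5.59 obtained above.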

There is quite a lot of controversy related to the meaning of the greenhouse gas trends. 
Most of this debate focuses on whether the increases are contributing to global warming. 
However, regardless of the ultimate implications, it is sobering to realize that something 
as large as our atmosphere has changed so much over a relatively short time period. This 
case study illustrates how numerical methods and MATLAB can be employed to analyze 
and interpret such trends. Over the coming years, engineers and scientists can hopefully 
use such tools to gain increased understanding of such phenomena and help rationalize the 
debate over their ramifications. 


PROBLEMS 


5.1 Use bisection to determine the drag coefficient needed
so that a 95-kg bungee jumper has a velocity of 46 m/s after
9 s of free fall. Note: The acceleration of gravity is 9.81 m/s².
Start with initial guesses of $x_l = 0.2$ and $x_u = 0.5$ and iterate
until the approximate relative error falls below 5%.

5.2 Develop your own M-file for bisection in a similar fash- 
ion to Fig. 5.7. However, rather than using the maximum 
iterations and Eq. (5.5), employ Eq. (5.6) as your stopping 
criterion. Make sure to round the result of Eq. (5.6) up to the
next highest integer (Hint: the ceil function provides a handy
way to do this). The first line of your function should be


function [root,Ea,ea,n] = bisectnew(func,xl,xu,Ead,varargin)


Note that for the output, Ea = the approximate absolute error 
and ea = the approximate percent relative error. Then de- 
velop your own script, called LastNameHmwk04Script to 
solve Prob. 5.1. Note that you MUST pass the parameters 
via the argument. In addition, set up the function so that it 
uses a default value for Ead = 0.000001. 

5.3 Figure P5.3 shows a pinned-fixed beam subject to a uniform
load. The equation for the resulting deflections is

$$y = -\frac{w}{48EI}\left(2x^4 - 3Lx^3 + L^3 x\right)$$

Develop a MATLAB script that
(a) plots the function, dy/dx versus x (with appropriate
    labels) and
(b) uses LastNameBisect to determine the point of maximum
    deflection (i.e., the value of x where dy/dx = 0).
    Then substitute this value into the deflection equation to
    determine the value of the maximum deflection. Employ
    initial guesses of $x_l = 0$ and $x_u = 0.9L$. Use the following
    parameter values in your computation (making sure
    that you use consistent units): L = 400 cm, E = 52,000
    kN/cm², I = 32,000 cm⁴, and w = 4 kN/cm. In addition,
    use Ead = 0.0000001 m. Also, set format long in
    your script so you display 15 significant digits for your
    results.


FIGURE P5.3

5.4 As shown in Fig. P5.4, the velocity of water, v (m/s),
discharged from a cylindrical tank through a long pipe can
be computed as

$$v = \sqrt{2gH} \tanh\left(\frac{\sqrt{2gH}}{2L} t\right)$$

where g = 9.81 m/s², H = initial head (m), L = pipe length
(m), and t = elapsed time (s). Develop a MATLAB script that
(a) plots the function f(H) versus H for H = 0 to 4 m (make
    sure to label the plot) and
(b) uses LastNameBisect with initial guesses of $x_l = 0$ and
    $x_u = 4$ m to determine the initial head needed to achieve
    v = 5 m/s in 2.5 s for a 4-m long pipe. In addition, use
    Ead = 0.0000001. Also, set format long in your script
    so you display 15 significant digits for your results.

5.5 Repeat Prob. 5.1, but use the false-position method to 

obtain your solution. 

5.6 Develop an M-file for the false-position method. Test it 

by solving Prob. 5.1. 


FIGURE P5.4 


5.7 (a) Determine the roots of $f(x) = -12 - 21x + 18x^2 - 2.75x^3$
graphically. In addition, determine the first root
of the function with (b) bisection and (c) false position.
For (b) and (c) use initial guesses of $x_l = -1$ and $x_u = 0$ and
a stopping criterion of 1%.

5.8 Locate the first nontrivial root of $\sin(x) = x^2$ where x is
in radians. Use a graphical technique and bisection with the
initial interval from 0.5 to 1. Perform the computation until
$\varepsilon_a$ is less than $\varepsilon_s = 2\%$.

5.9 Determine the positive real root of $\ln(x^2) = 0.7$ (a)
graphically, (b) using three iterations of the bisection
method, with initial guesses of $x_l = 0.5$ and $x_u = 2$, and
(c) using three iterations of the false-position method, with
the same initial guesses as in (b).

5.10 The saturation concentration of dissolved oxygen in
freshwater can be calculated with the equation

$$\ln o_{sf} = -139.34411 + \frac{1.575701 \times 10^5}{T_a} - \frac{6.642308 \times 10^7}{T_a^2} + \frac{1.243800 \times 10^{10}}{T_a^3} - \frac{8.621949 \times 10^{11}}{T_a^4}$$

where $o_{sf}$ = the saturation concentration of dissolved
oxygen in freshwater at 1 atm (mg L⁻¹); and $T_a$ = absolute
temperature (K). Remember that $T_a = T + 273.15$, where
T = temperature (°C). According to this equation, saturation
decreases with increasing temperature. For typical natural
waters in temperate climates, the equation can be used to
determine that oxygen concentration ranges from 14.621 mg/L
at 0 °C to 6.949 mg/L at 35 °C. Given a value of oxygen
concentration, this formula and the bisection method can be
used to solve for temperature in °C.

(a) If the initial guesses are set as 0 and 35 °C, how many
    bisection iterations would be required to determine
    temperature to an absolute error of 0.05 °C?

(b) Based on (a), develop and test a bisection M-file function
    to determine T as a function of a given oxygen
    concentration. Test your function for $o_{sf}$ = 8, 10, and 14 mg/L.
    Check your results.

5.11 A beam is loaded as shown in Fig. P5.11. Use the 

bisection method to solve for the position inside the beam 

where there is no moment. 

FIGURE P5.11


5.12 Water is flowing in a trapezoidal channel at a rate of
Q = 20 m³/s. The critical depth y for such a channel must
satisfy the equation

$$0 = 1 - \frac{Q^2}{g A_c^3} B$$

where g = 9.81 m/s², $A_c$ = the cross-sectional area (m²), and
B = the width of the channel at the surface (m). For this
case, the width and the cross-sectional area can be related
to depth y by

$$B = 3 + y$$

and

$$A_c = 3y + \frac{y^2}{2}$$

Solve for the critical depth using (a) the graphical method,
(b) bisection, and (c) false position. For (b) and (c) use
initial guesses of $x_l = 0.5$ and $x_u = 2.5$, and iterate until the
approximate error falls below 1% or the number of iterations
exceeds 10. Discuss your results.
5.13 The Michaelis-Menten model describes the kinetics of
enzyme mediated reactions:

$$\frac{dS}{dt} = -v_m \frac{S}{k_s + S}$$

where S = substrate concentration (moles/L), $v_m$ = maximum
uptake rate (moles/L/d), and $k_s$ = the half-saturation
constant, which is the substrate level at which uptake is half
of the maximum [moles/L]. If the initial substrate level at
t = 0 is $S_0$, this differential equation can be solved for

$$S = S_0 - v_m t + k_s \ln(S_0/S)$$

Develop an M-file to generate a plot of S versus t for the
case where $S_0$ = 8 moles/L, $v_m$ = 0.7 moles/L/d, and $k_s$ =
2.5 moles/L.
5.14 A reversible chemical reaction

$$2A + B \rightleftharpoons C$$

can be characterized by the equilibrium relationship

$$K = \frac{c_c}{c_a^2 c_b}$$

where the nomenclature $c_i$ represents the concentration of
constituent i. Suppose that we define a variable x as
representing the number of moles of C that are produced.
Conservation of mass can be used to reformulate the equilibrium
relationship as

$$K = \frac{c_{c,0} + x}{(c_{a,0} - 2x)^2 (c_{b,0} - x)}$$

where the subscript 0 designates the initial concentration
of each constituent. If K = 0.016, $c_{a,0}$ = 42, $c_{b,0}$ = 28, and
$c_{c,0}$ = 4, determine the value of x.

(a) Obtain the solution graphically.

(b) On the basis of (a), solve for the root with initial guesses of
    $x_l = 0$ and $x_u = 20$ to $\varepsilon_s = 0.5\%$. Choose either bisection or
    false position to obtain your solution. Justify your choice.

5.15 Figure P5.15a shows a uniform beam subject to a linearly
increasing distributed load. The equation for the resulting
elastic curve is (see Fig. P5.15)

$$y = \frac{w_0}{120EIL}\left(-x^5 + 2L^2x^3 - L^4x\right) \tag{P5.15}$$

Use bisection to determine the point of maximum deflection
(i.e., the value of x where dy/dx = 0). Then substitute
this value into Eq. (P5.15) to determine the value of the
maximum deflection. Use the following parameter values
in your computation: L = 600 cm, E = 50,000 kN/cm²,
I = 30,000 cm⁴, and $w_0$ = 2.5 kN/cm.


FIGURE P5.15


5.16 You buy a $35,000 vehicle for nothing down at $8500
per year for 7 years. Use the bisect function from Fig. 5.7
to determine the interest rate that you are paying. Employ
initial guesses for the interest rate of 0.01 and 0.3 and a
stopping criterion of 0.00005. The formula relating present
worth P, annual payments A, number of years n, and interest
rate i is

$$A = P \frac{i(1 + i)^n}{(1 + i)^n - 1}$$


5.17 Many fields of engineering require accurate population
estimates. For example, transportation engineers might find
it necessary to determine separately the population growth
trends of a city and adjacent suburb. The population of the
urban area is declining with time according to

$$P_u(t) = P_{u,\max} e^{-k_u t} + P_{u,\min}$$

while the suburban population is growing, as in

$$P_s(t) = \frac{P_{s,\max}}{1 + [P_{s,\max}/P_0 - 1] e^{-k_s t}}$$

where $P_{u,\max}$, $k_u$, $P_{s,\max}$, $P_0$, and $k_s$ = empirically derived
parameters. Determine the time and corresponding values of
$P_u(t)$ and $P_s(t)$ when the suburbs are 20% larger than the
city. The parameter values are $P_{u,\max}$ = 80,000, $k_u$ = 0.05/yr,
$P_{u,\min}$ = 110,000 people, $P_{s,\max}$ = 320,000 people, $P_0$ =
10,000 people, and $k_s$ = 0.09/yr. To obtain your solutions,
use (a) graphical and (b) false-position methods.
5.18 The resistivity ρ of doped silicon is based on the charge
q on an electron, the electron density n, and the electron
mobility μ. The electron density is given in terms of the doping
density N and the intrinsic carrier density $n_i$. The electron
mobility is described by the temperature T, the reference
temperature $T_0$, and the reference mobility $\mu_0$. The equations
required to compute the resistivity are

$$\rho = \frac{1}{q n \mu}$$

where

$$n = \frac{1}{2}\left(N + \sqrt{N^2 + 4n_i^2}\right) \quad \text{and} \quad \mu = \mu_0 \left(\frac{T}{T_0}\right)^{-2.42}$$

Determine N, given $T_0$ = 300 K, T = 1000 K, $\mu_0$ =
1360 cm² (V s)⁻¹, q = 1.7 × 10⁻¹⁹ C, $n_i$ = 6.21 × 10⁹ cm⁻³,
and a desired ρ = 6.5 × 10⁶ V s cm/C. Employ initial guesses
of N = 0 and 2.5 × 10¹⁰. Use (a) bisection and (b) the false
position method.


FIGURE P5.19


5.19 A total charge Q is uniformly distributed around a
ring-shaped conductor with radius a. A charge q is located
at a distance x from the center of the ring (Fig. P5.19). The
force exerted on the charge by the ring is given by

$$F = \frac{1}{4\pi e_0} \frac{qQx}{(x^2 + a^2)^{3/2}}$$

where $e_0$ = 8.9 × 10⁻¹² C²/(N m²). Find the distance x where
the force is 1.25 N if q and Q are 2 × 10⁻⁵ C for a ring with a
radius of 0.85 m.

5.20 For fluid flow in pipes, friction is described by a
dimensionless number, the Fanning friction factor f. The
Fanning friction factor is dependent on a number of parameters
related to the size of the pipe and the fluid, which can
all be represented by another dimensionless quantity, the
Reynolds number Re. A formula that predicts f given Re is
the von Karman equation:

$$\frac{1}{\sqrt{f}} = 4 \log_{10}\left(\mathrm{Re}\sqrt{f}\right) - 0.4$$

Typical values for the Reynolds number for turbulent flow
are 10,000 to 500,000 and for the Fanning friction factor are
0.001 to 0.01. Develop a function that uses bisection to solve
for f given a user-supplied value of Re between 2500 and
1,000,000. Design the function so that it ensures that the
absolute error in the result is $E_{a,d} < 0.000005$.

5.21 Mechanical engineers, as well as most other engineers,
use thermodynamics extensively in their work. The following
polynomial can be used to relate the zero-pressure specific
heat of dry air $c_p$ kJ/(kg K) to temperature (K):

$$c_p = 0.99403 + 1.671 \times 10^{-4}\,T + 9.7215 \times 10^{-8}\,T^2 - 9.5838 \times 10^{-11}\,T^3 + 1.9520 \times 10^{-14}\,T^4$$

Develop a plot of $c_p$ versus a range of T = 0 to 1200 K, and
then use bisection to determine the temperature that
corresponds to a specific heat of 1.1 kJ/(kg K).

5.22 The upward velocity of a rocket can be computed by
the following formula:

$$v = u \ln\frac{m_0}{m_0 - qt} - gt$$

where v = upward velocity, u = the velocity at which fuel is
expelled relative to the rocket, $m_0$ = the initial mass of the
rocket at time t = 0, q = the fuel consumption rate, and g = the
downward acceleration of gravity (assumed constant =
9.81 m/s²). If u = 1800 m/s, $m_0$ = 160,000 kg, and q =
2600 kg/s, compute the time at which v = 750 m/s. (Hint: t
is somewhere between 10 and 50 s.) Determine your result
so that it is within 1% of the true value. Check your answer.
5.23 Although we did not mention it in Sec. 5.6, Eq. (5.13) is
an expression of electroneutrality; that is, positive and
negative charges must balance. This can be seen more clearly
by expressing it as

$$[\mathrm{H}^+] = [\mathrm{HCO}_3^-] + 2[\mathrm{CO}_3^{2-}] + [\mathrm{OH}^-]$$

In other words, the positive charges must equal the negative
charges. Thus, when you compute the pH of a natural water
body such as a lake, you must also account for other ions that
may be present. For the case where these ions originate from
nonreactive salts, the net negative minus positive charges
due to these ions are lumped together in a quantity called
alkalinity, and the equation is reformulated as

$$\mathrm{Alk} + [\mathrm{H}^+] = [\mathrm{HCO}_3^-] + 2[\mathrm{CO}_3^{2-}] + [\mathrm{OH}^-] \tag{P5.23}$$

where Alk = alkalinity (eq/L). For example, the alkalinity of
Lake Superior is approximately 0.4 × 10⁻³ eq/L. Perform the
same calculations as in Sec. 5.6 to compute the pH of Lake
Superior in 2008. Assume that just like the raindrops, the
lake is in equilibrium with atmospheric CO2 but account for
the alkalinity as in Eq. (P5.23).

5.24 According to Archimedes’ principle, the buoyancy force
is equal to the weight of fluid displaced by the submerged
portion of the object. For the sphere depicted in Fig. P5.24,
use bisection to determine the height, h, of the portion that is
above water. Employ the following values for your computation:
r = 1 m, $\rho_s$ = density of sphere = 200 kg/m³, and $\rho_w$ =
density of water = 1000 kg/m³. Note that the volume of the
above-water portion of the sphere can be computed with

$$V = \frac{\pi h^2}{3}(3r - h)$$


FIGURE P5.24


5.25 Perform the same computation as in Prob. 5.24, but for
the frustum of a cone as depicted in Fig. P5.25. Employ the
following values for your computation: $r_1$ = 0.5 m, $r_2$ = 1 m,
h = 1 m, $\rho_f$ = frustum density = 200 kg/m³, and $\rho_w$ = water
density = 1000 kg/m³. Note that the volume of a frustum is
given by

$$V = \frac{\pi h}{3}\left(r_1^2 + r_2^2 + r_1 r_2\right)$$


FIGURE P5.25


Roots: Open Methods


CHAPTER OBJECTIVES 


The primary objective of this chapter is to acquaint you with open methods for finding
the root of a single nonlinear equation. Specific objectives and topics covered are


• Recognizing the difference between bracketing and open methods for root
  location.
• Understanding the fixed-point iteration method and how you can evaluate its
  convergence characteristics.
• Knowing how to solve a roots problem with the Newton-Raphson method and
  appreciating the concept of quadratic convergence.
• Knowing how to implement both the secant and the modified secant methods.
• Understanding how Brent’s method combines reliable bracketing methods with
  fast open methods to locate roots in a robust and efficient manner.
• Knowing how to use MATLAB’s fzero function to estimate roots.
• Learning how to manipulate and determine the roots of polynomials with
  MATLAB.


For the bracketing methods in Chap. 5, the root is located within an interval prescribed
by a lower and an upper bound. Repeated application of these methods always results
in closer estimates of the true value of the root. Such methods are said to be convergent
because they move closer to the truth as the computation progresses (Fig. 6.1a).

In contrast, the open methods described in this chapter require only a single starting
value or two starting values that do not necessarily bracket the root. As such, they sometimes
diverge or move away from the true root as the computation progresses (Fig. 6.1b).
However, when the open methods converge (Fig. 6.1c) they usually do so much more
quickly than the bracketing methods. We will begin our discussion of open techniques with
a simple approach that is useful for illustrating their general form and also for demonstrating
the concept of convergence.


FIGURE 6.1
Graphical depiction of the fundamental difference between the (a) bracketing and (b) and (c)
open methods for root location. In (a), which is bisection, the root is constrained within the
interval prescribed by $x_l$ and $x_u$. In contrast, for the open method depicted in (b) and (c), which
is Newton-Raphson, a formula is used to project from $x_i$ to $x_{i+1}$ in an iterative fashion. Thus
the method can either (b) diverge or (c) converge rapidly, depending on the shape of the
function and the value of the initial guess.


SIMPLE FIXED-POINT ITERATION 


As just mentioned, open methods employ a formula to predict the root. Such a formula can
be developed for simple fixed-point iteration (or, as it is also called, one-point iteration or
successive substitution) by rearranging the function f(x) = 0 so that x is on the left-hand
side of the equation:

$$x = g(x) \tag{6.1}$$

This transformation can be accomplished either by algebraic manipulation or by simply
adding x to both sides of the original equation.

The utility of Eq. (6.1) is that it provides a formula to predict a new value of x as a
function of an old value of x. Thus, given an initial guess at the root $x_i$, Eq. (6.1) can be used
to compute a new estimate $x_{i+1}$ as expressed by the iterative formula

$$x_{i+1} = g(x_i) \tag{6.2}$$

As with many other iterative formulas in this book, the approximate error for this equation
can be determined using the error estimator:

$$\varepsilon_a = \left| \frac{x_{i+1} - x_i}{x_{i+1}} \right| 100\% \tag{6.3}$$


EXAMPLE 6.1 Simple Fixed-Point Iteration


Problem Statement. Use simple fixed-point iteration to locate the root of $f(x) = e^{-x} - x$.


Solution. The function can be separated directly and expressed in the form of Eq. (6.2) as

$$x_{i+1} = e^{-x_i}$$

Starting with an initial guess of $x_0 = 0$, this iterative equation can be applied to compute:


 i     x_i       |ε_a|, %    |ε_t|, %    |ε_t|_i / |ε_t|_{i-1}
 0     0.0000                100.000
 1     1.0000    100.000      76.322     0.763
 2     0.3679    171.828      35.135     0.460
 3     0.6922     46.854      22.050     0.628
 4     0.5005     38.309      11.755     0.533
 5     0.6062     17.447       6.894     0.586
 6     0.5454     11.157       3.835     0.556
 7     0.5796      5.903       2.199     0.573
 8     0.5601      3.481       1.239     0.564
 9     0.5711      1.931       0.705     0.569
10     0.5649      1.109       0.399     0.566


Thus, each iteration brings the estimate closer to the true value of the root: 0.56714329.
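
The table can be regenerated with a short loop built directly on Eqs. (6.2) and (6.3). This is a minimal sketch; the variable names are illustrative, and the true root is taken from the value quoted above.

```matlab
% Sketch: simple fixed-point iteration x = exp(-x), starting from x0 = 0.
g = @(x) exp(-x);
xtrue = 0.56714329;                 % true root quoted in the example
x = 0;                              % initial guess
for i = 1:10
    xold = x;
    x = g(xold);                            % Eq. (6.2)
    ea = abs((x - xold)/x)*100;             % Eq. (6.3)
    et = abs((xtrue - x)/xtrue)*100;        % true percent relative error
    fprintf('%2d   %.4f   %8.3f   %8.3f\n', i, x, ea, et);
end
```

In practice the fixed count of 10 would usually be replaced by a test that stops once ea falls below a prescribed tolerance.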


Notice that the true percent relative error for each iteration of Example 6.1 is roughly
proportional (for this case, by a factor of about 0.5 to 0.6) to the error from the previous
iteration. This property, called linear convergence, is characteristic of fixed-point
iteration.

Aside from the “rate” of convergence, we must comment at this point about the
“possibility” of convergence. The concepts of convergence and divergence can be depicted
graphically. Recall that in Sec. 5.2, we graphed a function to visualize its structure and
behavior. Such an approach is employed in Fig. 6.2a for the function $f(x) = e^{-x} - x$. An
alternative graphical approach is to separate the equation into two component parts, as in

$$f_1(x) = f_2(x)$$

Then the two equations

$$y_1 = f_1(x) \tag{6.4}$$

and

$$y_2 = f_2(x) \tag{6.5}$$

can be plotted separately (Fig. 6.2b). The x values corresponding to the intersections of
these functions represent the roots of f(x) = 0.


FIGURE 6.2
Two alternative graphical methods for determining the root of $f(x) = e^{-x} - x$. (a) Root at the
point where it crosses the x axis; (b) root at the intersection of the component functions.

The two-curve method can now be used to illustrate the convergence and divergence
of fixed-point iteration. First, Eq. (6.1) can be reexpressed as a pair of equations $y_1 = x$ and
$y_2 = g(x)$. These two equations can then be plotted separately. As was the case with Eqs.
(6.4) and (6.5), the roots of f(x) = 0 correspond to the abscissa value at the intersection of
the two curves. The function $y_1 = x$ and four different shapes for $y_2 = g(x)$ are plotted in
Fig. 6.3.

For the first case (Fig. 6.3a), the initial guess of $x_0$ is used to determine the
corresponding point on the $y_2$ curve $[x_0, g(x_0)]$. The point $[x_1, x_1]$ is located by moving left
horizontally to the $y_1$ curve. These movements are equivalent to the first iteration of
the fixed-point method:

$$x_1 = g(x_0)$$

Thus, in both the equation and in the plot, a starting value of $x_0$ is used to obtain an estimate
of $x_1$. The next iteration consists of moving to $[x_1, g(x_1)]$ and then to $[x_2, x_2]$. This iteration
is equivalent to the equation

$$x_2 = g(x_1)$$

The solution in Fig. 6.3a is convergent because the estimates of x move closer to the
root with each iteration. The same is true for Fig. 6.3b. However, this is not the case for
Fig. 6.3c and d, where the iterations diverge from the root.


FIGURE 6.3
“Cobweb plots” depicting convergence (a and b) and divergence (c and d). Graphs (a) and
(c) are called monotone patterns whereas (b) and (d) are called oscillating or spiral patterns.
Note that convergence occurs when $|g'(x)| < 1$.

A theoretical derivation can be used to gain insight into the process. As described in
Chapra and Canale (2010), it can be shown that the error for any iteration is linearly
proportional to the error from the previous iteration multiplied by the absolute value of the slope of g:

$$E_{t,i+1} = g'(\xi) E_{t,i}$$

Consequently, if $|g'| < 1$, the errors decrease with each iteration. For $|g'| > 1$ the errors
grow. Notice also that if the derivative is positive, the errors will be positive, and hence the
errors will have the same sign (Fig. 6.3a and c). If the derivative is negative, the errors will
change sign on each iteration (Fig. 6.3b and d).
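
This criterion can be checked numerically for the iteration of Example 6.1, where $g(x) = e^{-x}$ and hence $g'(x) = -e^{-x}$. The sketch below (illustrative names only) evaluates the slope at the root and compares it with the ratio of successive true errors; a slope with magnitude below 1 and a negative sign should correspond to a convergent, oscillating pattern.

```matlab
% Sketch: compare g'(x) at the root with the observed true-error ratio.
g  = @(x) exp(-x);
dg = @(x) -exp(-x);                 % derivative of g
xr = 0.56714329;                    % root quoted in Example 6.1
slope = dg(xr);                     % slope of g at the root

x = 0;                              % initial guess
E = xr - x;                         % true error of the initial guess
for i = 1:10
    x = g(x);
    Enew = xr - x;
    ratio = Enew/E;                 % ratio of successive true errors
    E = Enew;
end
fprintf('g''(xr) = %.4f, last error ratio = %.4f\n', slope, ratio);
```

The sign of each ratio is negative, so the estimates alternate between overshooting and undershooting the root, as in Fig. 6.3b.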


NEWTON-RAPHSON 


Perhaps the most widely used of all root-locating formulas is the Newton-Raphson
method (Fig. 6.4). If the initial guess at the root is $x_i$, a tangent can be extended from
the point $[x_i, f(x_i)]$. The point where this tangent crosses the x axis usually represents an
improved estimate of the root.

The Newton-Raphson method can be derived on the basis of this geometrical
interpretation. As in Fig. 6.4, the first derivative at $x_i$ is equivalent to the slope:

$$f'(x_i) = \frac{f(x_i) - 0}{x_i - x_{i+1}}$$

which can be rearranged to yield

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \tag{6.6}$$

which is called the Newton-Raphson formula.


EXAMPLE 6.2 Newton-Raphson Method


Problem Statement. Use the Newton-Raphson method to estimate the root of
$f(x) = e^{-x} - x$ employing an initial guess of $x_0 = 0$.


Solution. The first derivative of the function can be evaluated as

$$f'(x) = -e^{-x} - 1$$

which can be substituted along with the original function into Eq. (6.6) to give

$$x_{i+1} = x_i - \frac{e^{-x_i} - x_i}{-e^{-x_i} - 1}$$

Starting with an initial guess of $x_0 = 0$, this iterative equation can be applied to compute


i     x_i            |ε_t|, %
0     0              100
1     0.500000000    11.8
2     0.566311003    0.147
3     0.567143165    0.0000220
4     0.567143290    < 10⁻⁸


Thus, the approach rapidly converges on the true root. Notice that the true percent relative
error at each iteration decreases much faster than it does in simple fixed-point iteration
(compare with Example 6.1).


FIGURE 6.4
Graphical depiction of the Newton-Raphson method. A tangent to the function of $x_i$ [i.e., $f'(x)$]
is extrapolated down to the x axis to provide an estimate of the root at $x_{i+1}$.
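
The Newton-Raphson iterations for $f(x) = e^{-x} - x$ can be carried out with a loop built on Eq. (6.6). This is a minimal sketch with illustrative names; the function and its derivative are supplied as anonymous functions, and the true root 0.56714329 is the value quoted in Example 6.1.

```matlab
% Sketch: Newton-Raphson for f(x) = exp(-x) - x, starting from x0 = 0.
f  = @(x) exp(-x) - x;
df = @(x) -exp(-x) - 1;             % analytical first derivative
xtrue = 0.56714329;                 % true root quoted in Example 6.1
x = 0;                              % initial guess
for i = 1:4
    x = x - f(x)/df(x);                     % Eq. (6.6)
    et = abs((xtrue - x)/xtrue)*100;        % true percent relative error
    fprintf('%d   %.9f   %.3g\n', i, x, et);
end
```

Because the derivative appears in the denominator, a practical implementation would also guard against df(x) being zero or very small.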


As with other root-location methods, Eq. (6.3) can be used as a termination criterion.
In addition, a theoretical analysis (Chapra and Canale, 2010) provides insight regarding the
rate of convergence as expressed by

$$E_{t,i+1} = \frac{-f''(x_r)}{2f'(x_r)} E_{t,i}^2 \tag{6.7}$$

Thus, the error should be roughly proportional to the square of the previous error. In other
words, the number of significant figures of accuracy approximately doubles with each
iteration. This behavior is called quadratic convergence and is one of the major reasons for
the popularity of the method.
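
Equation (6.7) can be tested against the iterates for $f(x) = e^{-x} - x$. In the sketch below (illustrative names; the reference root is obtained by simply running Newton-Raphson to convergence), the constant $-f''(x_r)/[2f'(x_r)]$ is compared with the observed ratios $E_{t,i+1}/E_{t,i}^2$.

```matlab
% Sketch: numerical check of the quadratic convergence estimate, Eq. (6.7).
f   = @(x) exp(-x) - x;
df  = @(x) -exp(-x) - 1;
d2f = @(x) exp(-x);                 % second derivative of f

xr = 0.5;                           % reference root: iterate to convergence
for i = 1:10
    xr = xr - f(xr)/df(xr);
end
C = -d2f(xr)/(2*df(xr));            % proportionality constant in Eq. (6.7)

x = 0;                              % initial guess
E = zeros(1, 4);
E(1) = xr - x;                      % true error of the initial guess
for i = 1:3
    x = x - f(x)/df(x);             % Eq. (6.6)
    E(i+1) = xr - x;
end
ratios = E(2:end)./E(1:end-1).^2;   % observed E(i+1)/E(i)^2
fprintf('C = %.5f, observed ratios: %s\n', C, mat2str(ratios, 4));
```

The observed ratios approach C as the estimates near the root, which is the sense in which the error is "roughly proportional" to the square of the previous error.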

Although the Newton-Raphson method is often very efficient, there are situations 
where it performs poorly. A special case—multiple roots—is discussed elsewhere (Chapra 
and Canale, 2010). However, even when dealing with simple roots, difficulties can also 
arise, as in the following example. 


A Slowly Converging Function with Newton-Raphson 
Problem Statement. Determine the positive root of $f(x) = x^{10} - 1$ using the
Newton-Raphson method and an initial guess of $x = 0.5$.

Solution. The Newton-Raphson formula for this case is

$$x_{i+1} = x_i - \frac{x_i^{10} - 1}{10x_i^9}$$

which can be used to compute

  i     $x_i$        $|\varepsilon_a|$, %
  0     0.5
  1     51.65        99.032
  2     46.485       11.111
  3     41.8365      11.111
  4     37.65285     11.111
  ...
  40    1.002316     2.130
  41    1.000024     0.229
  42    1            0.002


Thus, after the first poor prediction, the technique is converging on the true root of 1, but 
at a very slow rate. 

Why does this happen? As shown in Fig. 6.5, a simple plot of the first few iterations 
is helpful in providing insight. Notice how the first guess is in a region where the slope is 
near zero. Thus, the first iteration flings the solution far away from the initial guess to a new 
value (x = 51.65) where f(x) has an extremely high value. The solution then plods along for 
over 40 iterations until converging on the root with adequate accuracy. 
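
The slow march toward the root is easy to observe directly. The sketch below is our own loop (it does not use the newtraph M-file), with a stopping criterion of 0.01% chosen purely for illustration; it prints only the first few and last few iterations.

```matlab
% Newton-Raphson applied to f(x) = x^10 - 1 from the poor guess x0 = 0.5
f  = @(x) x^10 - 1;
df = @(x) 10*x^9;
x = 0.5; es = 0.01; ea = 100; iter = 0;   % es (%) is an assumed tolerance
while ea > es && iter < 100
    xold = x;
    x = x - f(x)/df(x);                   % Eq. (6.6)
    iter = iter + 1;
    ea = abs((x - xold)/x)*100;           % approximate percent relative error
    if iter <= 4 || ea < 5                % show the start and the end only
        fprintf('%3d %12.6f %10.3f\n', iter, x, ea);
    end
end
```

While x is large, each step shrinks it by only about 10%, which is why the approximate error sits near 11.111% for so many iterations.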


FIGURE 6.5
Graphical depiction of the Newton-Raphson method for a case with slow convergence. The
inset shows how a near-zero slope initially shoots the solution far from the root. Thereafter,
the solution very slowly converges on the root.


Aside from slow convergence due to the nature of the function, other difficulties can
arise, as illustrated in Fig. 6.6. For example, Fig. 6.6a depicts the case where an inflection
point (i.e., $f''(x) = 0$) occurs in the vicinity of a root. Notice that iterations beginning at
$x_0$ progressively diverge from the root. Fig. 6.6b illustrates the tendency of the Newton-Raphson
technique to oscillate around a local maximum or minimum. Such oscillations
may persist, or, as in Fig. 6.6b, a near-zero slope is reached whereupon the solution is sent
far from the area of interest. Figure 6.6c shows how an initial guess that is close to one
root can jump to a location several roots away. This tendency to move away from the area
of interest is due to the fact that near-zero slopes are encountered. Obviously, a zero slope
[$f'(x) = 0$] is a real disaster because it causes division by zero in the Newton-Raphson
formula [Eq. (6.6)]. As in Fig. 6.6d, it means that the solution shoots off horizontally and
never hits the x axis.

FIGURE 6.6
Four cases where the Newton-Raphson method exhibits poor convergence.

Thus, there is no general convergence criterion for Newton-Raphson. Its convergence 
depends on the nature of the function and on the accuracy of the initial guess. The only 
remedy is to have an initial guess that is “sufficiently” close to the root. And for some func- 
tions, no guess will work! Good guesses are usually predicated on knowledge of the physi- 
cal problem setting or on devices such as graphs that provide insight into the behavior of 
the solution. It also suggests that good computer software should be designed to recognize 
slow convergence or divergence. 
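
One way such safeguards might look is sketched below. This is not the newtraph M-file; it is a hypothetical loop that reports why it stopped. The slope threshold `slopetol`, the deliberately small iteration limit, and the status strings are all assumptions made for illustration.

```matlab
% Sketch: Newton-Raphson loop that reports why it stopped
f  = @(x) x^10 - 1;
df = @(x) 10*x^9;
x = 0.5; es = 1e-4; maxit = 20;        % assumed tolerance (%) and iteration limit
slopetol = 1e-12;                      % assumed threshold for a "zero" slope
status = 'iteration limit reached (slow convergence or divergence)';
for iter = 1:maxit
    slope = df(x);
    if abs(slope) < slopetol           % tangent is essentially horizontal
        status = 'near-zero slope';
        break
    end
    xnew = x - f(x)/slope;             % Eq. (6.6)
    if xnew ~= 0
        ea = abs((xnew - x)/xnew)*100; % approximate percent relative error
    else
        ea = 100;
    end
    x = xnew;
    if ea <= es
        status = 'converged';
        break
    end
end
fprintf('%s after %d iterations, x = %g\n', status, iter, x);
```

For the slowly converging function of the preceding example, 20 iterations are not enough, so the loop ends with the warning status rather than silently returning a poor estimate.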


6.2.1 MATLAB M-file: newtraph 


An algorithm for the Newton-Raphson method can be easily developed (Fig. 6.7). Note 
that the program must have access to the function (func) and its first derivative (dfunc). 
These can be simply accomplished by the inclusion of user-defined functions to compute 
these quantities. Alternatively, as in the algorithm in Fig. 6.7, they can be passed to the 
function as arguments. 

After the M-file is entered and saved, it can be invoked to solve for the root. For example,
for the simple function $x^2 - 9$, the root can be determined as in

>> newtraph(@(x) x^2-9,@(x) 2*x,5)
ans =
     3

Newton-Raphson Bungee Jumper Problem

Problem Statement. Use the M-file function from Fig. 6.7 to determine the mass of the
bungee jumper with a drag coefficient of 0.25 kg/m to have a velocity of 36 m/s after 4 s of
free fall. The acceleration of gravity is 9.81 m/s$^2$.


Solution. The function to be evaluated is

$$f(m) = \sqrt{\frac{g m}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right) - v(t) \tag{E6.4.1}$$

To apply the Newton-Raphson method, the derivative of this function must be evaluated
with respect to the unknown, m:

$$\frac{df(m)}{dm} = \frac{1}{2}\sqrt{\frac{g}{m c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right) - \frac{g t}{2m} \operatorname{sech}^2\left(\sqrt{\frac{g c_d}{m}}\, t\right) \tag{E6.4.2}$$

function [root,ea,iter]=newtraph(func,dfunc,xr,es,maxit,varargin)
% newtraph: Newton-Raphson root location zeroes
% [root,ea,iter]=newtraph(func,dfunc,xr,es,maxit,p1,p2,...):
%   uses Newton-Raphson method to find the root of func
% input:
%   func = name of function
%   dfunc = name of derivative of function
%   xr = initial guess
%   es = desired relative error (default = 0.0001%)
%   maxit = maximum allowable iterations (default = 50)
%   p1,p2,... = additional parameters used by function
% output:
%   root = real root
%   ea = approximate relative error (%)
%   iter = number of iterations

if nargin<3,error('at least 3 input arguments required'),end
if nargin<4||isempty(es),es=0.0001;end
if nargin<5||isempty(maxit),maxit=50;end
iter = 0;
ea = 100;   % defined up front so the test below works even if xr becomes 0
while (1)
  xrold = xr;
  xr = xr - func(xr)/dfunc(xr);
  iter = iter + 1;
  if xr ~= 0, ea = abs((xr - xrold)/xr) * 100; end
  if ea <= es || iter >= maxit, break, end
end
root = xr;

FIGURE 6.7
An M-file to implement the Newton-Raphson method.


We should mention that although this derivative is not difficult to evaluate in principle, it 
involves a bit of concentration and effort to arrive at the final result. 

The two formulas can now be used in conjunction with the function newtraph to evaluate
the root:

>> y = @(m) sqrt(9.81*m/0.25)*tanh(sqrt(9.81*0.25/m)*4)-36;
>> dy = @(m) 1/2*sqrt(9.81/(m*0.25))*tanh((9.81*0.25/m) ...
^(1/2)*4)-9.81*4/(2*m)*sech(sqrt(9.81*0.25/m)*4)^2;
>> newtraph(y,dy,140,0.00001)
ans =
  142.7376


6.3 SECANT METHODS

As in Example 6.4, a potential problem in implementing the Newton-Raphson method is
the evaluation of the derivative. Although this is not inconvenient for polynomials and
many other functions, there are certain functions whose derivatives may be difficult or
inconvenient to evaluate. For these cases, the derivative can be approximated by a backward
finite divided difference:

$$f'(x_i) \cong \frac{f(x_{i-1}) - f(x_i)}{x_{i-1} - x_i}$$

This approximation can be substituted into Eq. (6.6) to yield the following iterative
equation:

$$x_{i+1} = x_i - \frac{f(x_i)(x_{i-1} - x_i)}{f(x_{i-1}) - f(x_i)} \tag{6.8}$$

Equation (6.8) is the formula for the secant method. Notice that the approach requires two 
initial estimates of x. However, because f(x) is not required to change signs between the 
estimates, it is not classified as a bracketing method. 
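
A minimal sketch of Eq. (6.8) in MATLAB follows. The test function $f(x) = e^{-x} - x$ is borrowed from the earlier Newton-Raphson example; the two starting estimates (0 and 1), the tolerance, and the variable names are assumptions chosen for illustration.

```matlab
% Secant method, Eq. (6.8), applied to f(x) = exp(-x) - x
f = @(x) exp(-x) - x;
xold = 0; x = 1;                 % two initial estimates (assumed values)
es = 1e-6; maxit = 50;           % stopping tolerance (%) and iteration limit
for iter = 1:maxit
    xnew = x - f(x)*(xold - x)/(f(xold) - f(x));   % Eq. (6.8)
    ea = abs((xnew - x)/xnew)*100;                 % approximate error (%)
    xold = x;                    % shift the estimates in strict sequence
    x = xnew;
    if ea <= es, break, end
end
fprintf('root = %.8f after %d iterations\n', x, iter);
```

Note that the two estimates are replaced in strict sequence and nothing forces them to bracket the root, which is the sense in which this is an open method.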

Rather than using two arbitrary values to estimate the derivative, an alternative
approach involves a fractional perturbation of the independent variable to estimate $f'(x)$,

$$f'(x_i) \cong \frac{f(x_i + \delta x_i) - f(x_i)}{\delta x_i}$$

where $\delta$ = a small perturbation fraction. This approximation can be substituted into
Eq. (6.6) to yield the following iterative equation:

$$x_{i+1} = x_i - \frac{\delta x_i f(x_i)}{f(x_i + \delta x_i) - f(x_i)} \tag{6.9}$$



We call this the modified secant method. As in the following example, it provides a 
nice means to attain the efficiency of Newton-Raphson without having to compute 
derivatives. 


EXAMPLE 6.5 Modified Secant Method
Problem Statement. Use the modified secant method to determine the mass of the bungee
jumper with a drag coefficient of 0.25 kg/m to have a velocity of 36 m/s after 4 s of
free fall. Note: The acceleration of gravity is 9.81 m/s$^2$. Use an initial guess of 50 kg and a
value of $10^{-6}$ for the perturbation fraction.

Solution. Inserting the parameters into Eq. (6.9) yields

First iteration:

$$x_0 = 50 \qquad f(x_0) = -4.57938708$$
$$x_0 + \delta x_0 = 50.00005 \qquad f(x_0 + \delta x_0) = -4.579381118$$
$$x_1 = 50 - \frac{10^{-6}(50)(-4.57938708)}{-4.579381118 - (-4.57938708)} = 88.39931 \quad (|\varepsilon_t| = 38.1\%;\ |\varepsilon_a| = 43.4\%)$$

Second iteration:

$$x_1 = 88.39931 \qquad f(x_1) = -1.69220771$$
$$x_1 + \delta x_1 = 88.39940 \qquad f(x_1 + \delta x_1) = -1.692203516$$
$$x_2 = 88.39931 - \frac{10^{-6}(88.39931)(-1.69220771)}{-1.692203516 - (-1.69220771)} = 124.08970 \quad (|\varepsilon_t| = 13.1\%;\ |\varepsilon_a| = 28.76\%)$$

The calculation can be continued to yield

  i    $x_i$        $|\varepsilon_t|$, %         $|\varepsilon_a|$, %
  0    50.0000      64.971
  1    88.3993      38.069                43.438
  2    124.0897     13.064                28.762
  3    140.5417     1.538                 11.706
  4    142.7072     0.021                 1.517
  5    142.7376     $4.1 \times 10^{-6}$     0.021
  6    142.7376     $3.4 \times 10^{-12}$    $4.1 \times 10^{-6}$
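
The hand calculation above can be mirrored in a short script. This is a sketch rather than one of the book's M-files: the variable names (`cdrag`, `delta`, and so on) are our own, and only the approximate error is printed because it needs no knowledge of the true root.

```matlab
% Modified secant method, Eq. (6.9), for the bungee jumper mass
g = 9.81; cdrag = 0.25; t = 4; v = 36;          % problem parameters
f = @(m) sqrt(g*m/cdrag)*tanh(sqrt(g*cdrag/m)*t) - v;
delta = 1e-6;                                   % perturbation fraction
x = 50;                                         % initial guess (kg)
for i = 1:6
    xnew = x - delta*x*f(x)/(f(x + delta*x) - f(x));   % Eq. (6.9)
    ea = abs((xnew - x)/xnew)*100;              % approximate error (%)
    fprintf('%d %12.4f %12.3g\n', i, xnew, ea);
    x = xnew;
end
```

Each iteration costs two function evaluations, f(x) and f(x + delta*x), but no derivative is required.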


The choice of a proper value for $\delta$ is not automatic. If $\delta$ is too small, the method
can be swamped by roundoff error caused by subtractive cancellation in the denominator
of Eq. (6.9). If it is too big, the technique can become inefficient and even divergent.
However, if chosen correctly, it provides a nice alternative for cases where evaluating the
derivative is difficult and developing two initial guesses is inconvenient.

Further, in its most general sense, a univariate function is merely an entity that returns
a single value in return for values sent to it. Perceived in this sense, functions are not
always simple formulas like the one-line equations solved in the preceding examples in
this chapter. For example, a function might consist of many lines of code that could take a
significant amount of execution time to evaluate. In some cases, the function might even
represent an independent computer program. For such cases, the secant and modified secant
methods are valuable.


6.4 BRENT'S METHOD


Wouldn’t it be nice to have a hybrid approach that combined the reliability of bracketing 
with the speed of the open methods? Brent’s root-location method is a clever algorithm 
that does just that by applying a speedy open method wherever possible, but reverting to 
a reliable bracketing method if necessary. The approach was developed by Richard Brent 
(1973) based on an earlier algorithm of Theodorus Dekker (1969). 

The bracketing technique is the trusty bisection method (Sec. 5.4), whereas two different
open methods are employed. The first is the secant method described in Sec. 6.3. As
explained next, the second is inverse quadratic interpolation.

6.4.1 Inverse Quadratic Interpolation

Inverse quadratic interpolation is similar in spirit to the secant method. As in Fig. 6.8a,
the secant method is based on computing a straight line that goes through two guesses. The
intersection of this straight line with the x axis represents the new root estimate. For this
reason, it is sometimes referred to as a linear interpolation method.

Now suppose that we had three points. In that case, we could determine a quadratic
function of x that goes through the three points (Fig. 6.8b). Just as with the linear secant
method, the intersection of this parabola with the x axis would represent the new root
estimate. And as illustrated in Fig. 6.8b, using a curve rather than a straight line often yields
a better estimate.

Although this would seem to represent a great improvement, the approach has a fun- 
damental flaw: it is possible that the parabola might not intersect the x axis! Such would be 
the case when the resulting parabola had complex roots. This is illustrated by the parabola, 
y = f(x), in Fig. 6.9. 

The difficulty can be rectified by employing inverse quadratic interpolation. That 
is, rather than using a parabola in x, we can fit the points with a parabola in y. This 
amounts to reversing the axes and creating a “sideways” parabola [the curve, x = f(y), 
in Fig. 6.9]. 

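If the three points are designated as $(x_{i-2}, y_{i-2})$, $(x_{i-1}, y_{i-1})$, and $(x_i, y_i)$, a quadratic
function of y that passes through the points can be generated as

$$g(y) = \frac{(y - y_{i-1})(y - y_i)}{(y_{i-2} - y_{i-1})(y_{i-2} - y_i)}\, x_{i-2} + \frac{(y - y_{i-2})(y - y_i)}{(y_{i-1} - y_{i-2})(y_{i-1} - y_i)}\, x_{i-1} + \frac{(y - y_{i-2})(y - y_{i-1})}{(y_i - y_{i-2})(y_i - y_{i-1})}\, x_i \tag{6.10}$$
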
FIGURE 6.8
Comparison of (a) the secant method and (b) inverse quadratic interpolation. Note that the
approach in (b) is called "inverse" because the quadratic function is written in y rather than in x.

FIGURE 6.9
Two parabolas fit to three points. The parabola written as a function of x, y = f(x), has
complex roots and hence does not intersect the x axis. In contrast, if the variables are reversed,
and the parabola developed as x = f(y), the function does intersect the x axis.


As we will learn in Sec. 18.2, this form is called a Lagrange polynomial. The root, $x_{i+1}$,
corresponds to y = 0, which when substituted into Eq. (6.10) yields

$$x_{i+1} = \frac{y_{i-1}\, y_i}{(y_{i-2} - y_{i-1})(y_{i-2} - y_i)}\, x_{i-2} + \frac{y_{i-2}\, y_i}{(y_{i-1} - y_{i-2})(y_{i-1} - y_i)}\, x_{i-1} + \frac{y_{i-2}\, y_{i-1}}{(y_i - y_{i-2})(y_i - y_{i-1})}\, x_i \tag{6.11}$$

As shown in Fig. 6.9, such a "sideways" parabola always intersects the x axis.


Inverse Quadratic Interpolation 


Problem Statement. Develop quadratic equations in both x and y for the data points 
depicted in Fig. 6.9: (1, 2), (2, 1), and (4, 5). For the first, y = f(x), employ the quadratic 
formula to illustrate that the roots are complex. For the latter, x = g(y), use inverse quadratic 
interpolation (Eq. 6.11) to determine the root estimate. 


Solution. By reversing the x's and y's, Eq. (6.10) can be used to generate a quadratic in
x as

$$f(x) = \frac{(x - 2)(x - 4)}{(1 - 2)(1 - 4)}\, 2 + \frac{(x - 1)(x - 4)}{(2 - 1)(2 - 4)}\, 1 + \frac{(x - 1)(x - 2)}{(4 - 1)(4 - 2)}\, 5$$

or collecting terms

$$f(x) = x^2 - 4x + 5$$

This equation was used to generate the parabola, y = f(x), in Fig. 6.9. The quadratic
formula can be used to determine that the roots for this case are complex,

$$x = \frac{4 \pm \sqrt{(-4)^2 - 4(1)(5)}}{2} = 2 \pm i$$

Equation (6.10) can be used to generate the quadratic in y as

$$g(y) = \frac{(y - 1)(y - 5)}{(2 - 1)(2 - 5)}\, 1 + \frac{(y - 2)(y - 5)}{(1 - 2)(1 - 5)}\, 2 + \frac{(y - 2)(y - 1)}{(5 - 2)(5 - 1)}\, 4$$

or collecting terms:

$$g(y) = 0.5y^2 - 2.5y + 4$$

Finally, Eq. (6.11) can be used to determine the root as

$$x_{i+1} = \frac{-1(-5)}{(2 - 1)(2 - 5)}\, 1 + \frac{-2(-5)}{(1 - 2)(1 - 5)}\, 2 + \frac{-2(-1)}{(5 - 2)(5 - 1)}\, 4 = 4$$
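
The two parabolas can also be cross-checked numerically. The sketch below does not use Eqs. (6.10) and (6.11) directly; instead it relies on MATLAB's polyfit, which for three points and a second-order fit returns the interpolating quadratic. Swapping the roles of x and y produces the "sideways" parabola.

```matlab
% Cross-check of the two parabolas through (1,2), (2,1), and (4,5)
x = [1 2 4];
y = [2 1 5];
px = polyfit(x, y, 2);      % parabola in x: y = f(x)
rx = roots(px);             % its roots (a complex pair)
py = polyfit(y, x, 2);      % sideways parabola: x = g(y)
xnew = polyval(py, 0);      % root estimate, g(0)
disp(px), disp(rx), disp(py), disp(xnew)
```

Apart from roundoff, px and py should match the coefficients of f(x) and g(y) obtained by hand, and xnew should match the root estimate from Eq. (6.11).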


Before proceeding to Brent's algorithm, we need to mention one more case where
inverse quadratic interpolation does not work. If the three y values are not distinct (i.e.,
$y_{i-2} = y_{i-1}$ or $y_{i-1} = y_i$), an inverse quadratic function does not exist. So this is where the
secant method comes into play. If we arrive at a situation where the y values are not distinct,
we can always revert to the less efficient secant method to generate a root using two of
the points. If $y_{i-2} = y_{i-1}$, we use the secant method with $x_{i-1}$ and $x_i$. If $y_{i-1} = y_i$, we use $x_{i-2}$
and $x_{i-1}$.


6.4.2 Brent’s Method Algorithm 


The general idea behind Brent's root-finding method is to use one of the quick open
methods whenever possible. In the event that these generate an unacceptable result
(i.e., a root estimate that falls outside the bracket), the algorithm reverts to the more 
conservative bisection method. Although bisection may be slower, it generates an es- 
timate guaranteed to fall within the bracket. This process is then repeated until the 
root is located to within an acceptable tolerance. As might be expected, bisection typi- 
cally dominates at first but as the root is approached, the technique shifts to the faster 
open methods. 

Figure 6.10 presents a function based on a MATLAB M-file developed by Cleve Moler 
(2004). It represents a stripped down version of the fzero function which is the professional 
root-location function employed in MATLAB. For that reason, we call the simplified ver- 
sion: fzerosimp. Note that it requires another function f that holds the equation for which 
the root is being evaluated. 

The fzerosimp function is passed two initial guesses that must bracket the root. Then, 
the three variables defining the search interval (a,b,c) are initialized, and f is evaluated at 
the endpoints. 

A main loop is then implemented. If necessary, the three points are rearranged to
satisfy the conditions required for the algorithm to work effectively. At this point, if the
stopping criteria are met, the loop is terminated. Otherwise, a decision structure chooses
among the three methods and checks whether the outcome is acceptable. A final section
then evaluates f at the new point and the loop is repeated. Once the stopping criteria are
met, the loop terminates and the final root estimate is returned.

function b = fzerosimp(xl,xu)
a = xl; b = xu; fa = f(a); fb = f(b);
c = a; fc = fa; d = b - c; e = d;
while (1)
  if fb == 0, break, end
  if sign(fa) == sign(fb) %If needed, rearrange points
    a = c; fa = fc; d = b - c; e = d;
  end
  if abs(fa) < abs(fb)
    c = b; b = a; a = c;
    fc = fb; fb = fa; fa = fc;
  end
  m = 0.5*(a - b); %Termination test and possible exit
  tol = 2 * eps * max(abs(b), 1);
  if abs(m) <= tol || fb == 0.
    break
  end
  %Choose open methods or bisection
  if abs(e) >= tol && abs(fc) > abs(fb)
    s = fb/fc;
    if a == c %Secant method
      p = 2*m*s;
      q = 1 - s;
    else %Inverse quadratic interpolation
      q = fc/fa; r = fb/fa;
      p = s * (2*m*q * (q - r) - (b - c)*(r - 1));
      q = (q - 1)*(r - 1)*(s - 1);
    end
    if p > 0, q = -q; else p = -p; end;
    if 2*p < 3*m*q - abs(tol*q) && p < abs(0.5*e*q)
      e = d; d = p/q;
    else
      d = m; e = m;
    end
  else %Bisection
    d = m; e = m;
  end
  c = b; fc = fb;
  if abs(d) > tol, b=b+d; else b=b-sign(b-a)*tol; end
  fb = f(b);
end

FIGURE 6.10
Function for Brent's root-finding algorithm based on a MATLAB M-file developed by Cleve
Moler (2004).


6.5 MATLAB FUNCTION: fzero


The fzero function is designed to find the real root of a single equation. A simple represen- 
tation of its syntax is 


fzero( function, x0) 


where function is the name of the function being evaluated, and x0 is the initial guess. Note
that two guesses that bracket the root can be passed as a vector:

fzero(function,[x0 x1])

where x0 and x1 are guesses that bracket a sign change.

Here is a MATLAB session that solves for the root of a simple quadratic: $x^2 - 9$.
Clearly two roots exist at -3 and 3. To find the negative root:

>> x = fzero(@(x) x^2-9,-4)
x =
    -3

If we want to find the positive root, use a guess that is near it:

>> x = fzero(@(x) x^2-9,4)
x =
     3

If we put in an initial guess of zero, it finds the negative root:

>> x = fzero(@(x) x^2-9,0)
x =
    -3

If we wanted to ensure that we found the positive root, we could enter two guesses as in

>> x = fzero(@(x) x^2-9,[0 4])
x =
     3

Also, if a sign change does not occur between the two guesses, an error message is displayed:

>> x = fzero(@(x) x^2-9,[-4 4])
??? Error using ==> fzero
The function values at the interval endpoints must ...
differ in sign.

The fzero function works as follows. If a single initial guess is passed, it first performs
a search to identify a sign change. This search differs from the incremental search described
in Sec. 5.3.1, in that the search starts at the single initial guess and then takes increasingly
bigger steps in both the positive and negative directions until a sign change is detected.

Thereafter, the fast methods (secant and inverse quadratic interpolation) are used unless
an unacceptable result occurs (e.g., the root estimate falls outside the bracket). If an
unacceptable result happens, bisection is implemented until an acceptable root is obtained
with one of the fast methods. As might be expected, bisection typically dominates at first
but as the root is approached, the technique shifts to the faster methods.

A more complete representation of the fzero syntax can be written as

[x,fx] = fzero(function,x0,options,p1,p2,...)

where [x,fx] = a vector containing the root x and the function evaluated at the root fx,
options is a data structure created by the optimset function, and p1, p2, ... are any parameters
that the function requires. Note that if you desire to pass in parameters but not use the
options, pass an empty vector [] in its place.

The optimset function has the syntax

options = optimset('par1',val1,'par2',val2,...)

where the parameter par_i has the value val_i. A complete listing of all the possible
parameters can be obtained by merely entering optimset at the command prompt. The parameters
that are commonly used with the fzero function are

display: When set to 'iter' displays a detailed record of all the iterations.
tolx: A positive scalar that sets a termination tolerance on x.


EXAMPLE 6.7 The fzero and optimset Functions

Problem Statement. Recall that in Example 6.3, we found the positive root of
$f(x) = x^{10} - 1$ using the Newton-Raphson method with an initial guess of 0.5. Solve the same
problem with optimset and fzero.


Solution. An interactive MATLAB session can be implemented as follows:

>> options = optimset('display','iter');
>> [x,fx] = fzero(@(x) x^10-1,0.5,options)

 Func-count      x            f(x)         Procedure
     1         0.5         -0.999023       initial
     2         0.485858    -0.999267       search
     3         0.514142    -0.998709       search
     4         0.48        -0.999351       search
     5         0.52        -0.998554       search
     6         0.471716    -0.999454       search
     ...
    23         0.952548    -0.385007       search
    24        -0.14        -1              search
    25         1.14         2.70722        search

 Looking for a zero in the interval [-0.14, 1.14]

    26         0.205272    -1              interpolation
    27         0.672636    -0.981042       bisection
    28         0.906318    -0.626056       bisection
    29         1.02316      0.257278       bisection
    30         0.989128    -0.103551       interpolation
    31         0.998894    -0.0110017      interpolation
    32         1.00001      7.68385e-005   interpolation
    33         1           -3.83061e-007   interpolation
    34         1           -1.3245e-011    interpolation
    35         1            0              interpolation
 Zero found in the interval: [-0.14, 1.14].
x =
     1
fx =
     0

Thus, after 25 iterations of searching, fzero finds a sign change. It then uses interpola- 
tion and bisection until it gets close enough to the root so that interpolation takes over and 
rapidly converges on the root. 

Suppose that we would like to use a less stringent tolerance. We can use the optimset
function to set a looser termination tolerance on x, and a less accurate estimate of the root
results:

>> options = optimset('tolx',1e-3);
>> [x,fx] = fzero(@(x) x^10-1,0.5,options)
x =
    1.0009
fx =
    0.0090


6.6 POLYNOMIALS

Polynomials are a special type of nonlinear algebraic equation of the general form

$$f_n(x) = a_1 x^n + a_2 x^{n-1} + \cdots + a_{n-1} x^2 + a_n x + a_{n+1} \tag{6.12}$$


where n is the order of the polynomial, and the a’s are constant coefficients. In many (but 
not all) cases, the coefficients will be real. For such cases, the roots can be real and/or com- 
plex. In general, an nth order polynomial will have n roots. 

Polynomials have many applications in engineering and science. For example,
they are used extensively in curve fitting. However, one of their most interesting and
powerful applications is in characterizing dynamic systems and, in particular, linear
systems. Examples include reactors, mechanical devices, structures, and electrical
circuits.

6.6.1 MATLAB Function: roots

If you are dealing with a problem where you must determine a single real root of a
polynomial, techniques such as bisection and the Newton-Raphson method can have utility.
However, in many cases, engineers desire to determine all the roots, both real and complex. 
Unfortunately, simple techniques like bisection and Newton-Raphson are not available for 
determining all the roots of higher-order polynomials. However, MATLAB has an excel- 
lent built-in capability, the roots function, for this task. 

The roots function has the syntax, 


x = roots(c)


where x is a column vector containing the roots and c is a row vector containing the poly- 
nomial’s coefficients. 

So how does the roots function work? MATLAB is very good at finding the eigen- 
values of a matrix. Consequently, the approach is to recast the root evaluation task as an 
eigenvalue problem. Because we will be describing eigenvalue problems later in the book, 
we will merely provide an overview here. 

Suppose we have a polynomial 


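$$a_1 x^5 + a_2 x^4 + a_3 x^3 + a_4 x^2 + a_5 x + a_6 = 0 \tag{6.13}$$

Dividing by $a_1$ and rearranging yields

$$x^5 = -\frac{a_2}{a_1} x^4 - \frac{a_3}{a_1} x^3 - \frac{a_4}{a_1} x^2 - \frac{a_5}{a_1} x - \frac{a_6}{a_1}$$

A special matrix can be constructed by using the coefficients from the right-hand side as
the first row and with 1's and 0's written for the other rows as shown:

$$\begin{bmatrix}
-a_2/a_1 & -a_3/a_1 & -a_4/a_1 & -a_5/a_1 & -a_6/a_1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix} \tag{6.14}$$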


Equation (6.14) is called the polynomial’s companion matrix. It has the useful prop- 
erty that its eigenvalues are the roots of the polynomial. Thus, the algorithm underlying 
the roots function consists of merely setting up the companion matrix and then using 
MATLAB’s powerful eigenvalue evaluation function to determine the roots. Its applica- 
tion, along with some other related polynomial manipulation functions, are described in 
the following example. 
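
The idea can be tried out in a few lines. The sketch below builds the matrix of Eq. (6.14) by hand and compares its eigenvalues with the output of roots. The coefficient vector is the fifth-order polynomial used in the example that follows; the variable names are our own, and the construction is only meant to illustrate the principle, not to reproduce the internals of the built-in function.

```matlab
% Companion-matrix route to polynomial roots, following Eq. (6.14)
a = [1 -3.5 2.75 2.125 -3.875 1.25];   % coefficients a1 ... a6
n = length(a) - 1;                      % order of the polynomial
A = [-a(2:end)/a(1);                    % first row: -a2/a1 ... -a6/a1
     eye(n-1), zeros(n-1,1)];           % remaining rows: shifted identity
r_eig   = eig(A);                       % eigenvalues of the companion matrix
r_roots = roots(a);                     % built-in function, for comparison
disp([sort(r_eig, 'ComparisonMethod', 'real') ...
      sort(r_roots, 'ComparisonMethod', 'real')])
```

The two columns should agree to within roundoff, although the order in which eig and roots return their results may differ before sorting.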

We should note that roots has an inverse function called poly, which when passed the 
values of the roots, will return the polynomial’s coefficients. Its syntax is 


c = poly(r) 


where r is a column vector containing the roots and c is a row vector containing the
polynomial's coefficients.

EXAMPLE 6.8

Using MATLAB to Manipulate Polynomials and Determine Their Roots
Problem Statement. Use the following equation to explore how MATLAB can be employed
to manipulate polynomials:

$$f(x) = x^5 - 3.5x^4 + 2.75x^3 + 2.125x^2 - 3.875x + 1.25 \tag{E6.8.1}$$

Note that this polynomial has three real roots: 0.5, -1.0, and 2; and one pair of complex
roots: $1 \pm 0.5i$.


Solution. Polynomials are entered into MATLAB by storing the coefficients as a row vec- 
tor. For example, entering the following line stores the coefficients in the vector a: 


>> a = [1 -3.5 2.75 2.125 -3.875 1.25]; 
We can then proceed to manipulate the polynomial. For example, we can evaluate it at
x = 1 by typing

>> polyval(a,1)

with the result, $1(1)^5 - 3.5(1)^4 + 2.75(1)^3 + 2.125(1)^2 - 3.875(1) + 1.25 = -0.25$:

ans =
   -0.2500


We can create a quadratic polynomial that has roots corresponding to two of the original
roots of Eq. (E6.8.1): 0.5 and -1. This quadratic is $(x - 0.5)(x + 1) = x^2 + 0.5x - 0.5$.
It can be entered into MATLAB as the vector b:


>> b = [1 .5 -.5] 


b= 
1.0000 0.5000 -0.5000 


Note that the poly function can be used to perform the same task as in 


>> b = poly([0.5 -1]) 


b= 
1.0000 0.5000 -0.5000 


We can divide this polynomial into the original polynomial by

>> [q,r] = deconv(a,b)

with the result being a quotient (a third-order polynomial, q) and a remainder (r)

q =
    1.0000   -4.0000    5.2500   -2.5000
r =
     0     0     0     0     0     0

Because the polynomial is a perfect divisor, the remainder polynomial has zero
coefficients. Now, the roots of the quotient polynomial can be determined as

>> x = roots(q)

with the expected result that the remaining roots of the original polynomial Eq. (E6.8.1)
are found:

x =
   2.0000
   1.0000 + 0.5000i
   1.0000 - 0.5000i

We can now multiply q by b to come up with the original polynomial: 
>> a = conv(q,b) 


a = 
1.0000 -3.5000 2.7500 2.1250 -3.8750 1.2500 

We can then determine all the roots of the original polynomial by

>> x = roots(a)
x =
   2.0000
  -1.0000
   1.0000 + 0.5000i
   1.0000 - 0.5000i
   0.5000


Finally, we can return to the original polynomial again by using the poly function: 


>> a = poly(x) 


a = 
1.0000 -3.5000 2.7500 2.1250 -3.8750 1.2500 


6.7 CASE STUDY PIPE FRICTION 


Background. Determining fluid flow through pipes and tubes has great relevance in 
many areas of engineering and science. In engineering, typical applications include the 
flow of liquids and gases through pipelines and cooling systems. Scientists are interested 
in topics ranging from flow in blood vessels to nutrient transmission through a plant’s 
vascular system. 

The resistance to flow in such conduits is parameterized by a dimensionless number 
called the friction factor. For turbulent flow, the Colebrook equation provides a means to 
calculate the friction factor: 


$$0 = \frac{1}{\sqrt{f}} + 2.0 \log\left(\frac{\varepsilon}{3.7D} + \frac{2.51}{\mathrm{Re}\sqrt{f}}\right) \tag{6.15}$$

where $\varepsilon$ = the roughness (m), D = diameter (m), and Re = the Reynolds number:

$$\mathrm{Re} = \frac{\rho V D}{\mu}$$

where $\rho$ = the fluid's density (kg/m$^3$), V = its velocity (m/s), and $\mu$ = dynamic viscosity
(N·s/m$^2$). In addition to appearing in Eq. (6.15), the Reynolds number also serves as the
criterion for whether flow is turbulent (Re > 4000).

In this case study, we will illustrate how the numerical methods covered in this part of 
the book can be employed to determine f for air flow through a smooth, thin tube. For this 
case, the parameters are $\rho$ = 1.23 kg/m$^3$, $\mu$ = $1.79 \times 10^{-5}$ N·s/m$^2$, D = 0.005 m, V = 40 m/s,
and $\varepsilon$ = 0.0015 mm. Note that friction factors range from about 0.008 to 0.08. In addition,
an explicit formulation called the Swamee-Jain equation provides an approximate estimate:

$$f = \frac{1.325}{\left[\ln\left(\dfrac{\varepsilon}{3.7D} + \dfrac{5.74}{\mathrm{Re}^{0.9}}\right)\right]^2} \tag{6.16}$$


Solution. The Reynolds number can be computed as 


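$$\mathrm{Re} = \frac{\rho V D}{\mu} = \frac{1.23(40)0.005}{1.79 \times 10^{-5}} = 13{,}743$$

This value along with the other parameters can be substituted into Eq. (6.15) to give

$$g(f) = \frac{1}{\sqrt{f}} + 2.0 \log\left(\frac{0.0000015}{3.7(0.005)} + \frac{2.51}{13{,}743\sqrt{f}}\right)$$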


Before determining the root, it is advisable to plot the function to estimate initial
guesses and to anticipate possible difficulties. This can be done easily with MATLAB:

>> rho=1.23;mu=1.79e-5;D=0.005;V=40;e=0.0015/1000;
>> Re=rho*V*D/mu;
>> g=@(f) 1/sqrt(f)+2*log10(e/(3.7*D)+2.51/(Re*sqrt(f)));
>> fplot(g,[0.008 0.08]),grid,xlabel('f'),ylabel('g(f)')



As in Fig. 6.11, the root is located at about 0.03. 

Because we are supplied initial guesses ($x_l = 0.008$ and $x_u = 0.08$), either of the bracketing
methods from Chap. 5 could be used. For example, the bisect function developed in
Fig. 5.7 gives a value of f = 0.0289678 with a percent relative error of $5.926 \times 10^{-5}$
in 22 iterations. False position yields a result of similar precision in 26 iterations. Thus,
although they produce the correct result, they are somewhat inefficient. This would not be
important for a single application, but could become prohibitive if many evaluations were
made.

FIGURE 6.11


We could try to attain improved performance by turning to an open method. Because 
Eq. (6.15) is relatively straightforward to differentiate, the Newton-Raphson method is a 
good candidate. For example, using an initial guess at the lower end of the range
($x_0 = 0.008$), the newtraph function developed in Fig. 6.7 converges quickly:

>> dg=@(f) -2/log(10)*1.255/Re*f^(-3/2)/(e/D/3.7 ...
+2.51/Re/sqrt(f))-0.5/f^(3/2);
>> [f ea iter]=newtraph(g,dg,0.008)
f =
   0.02896781017144
ea =
   6.870124190058040e-006
iter =
     6


However, when the initial guess is set at the upper end of the range ($x_0 = 0.08$), the routine
diverges,

>> [f ea iter]=newtraph(g,dg,0.08)
f =
   NaN + NaNi


As can be seen by inspecting Fig. 6.11, this occurs because the function's slope at the
initial guess causes the first iteration to jump to a negative value. Further runs demonstrate
that for this case, convergence only occurs when the initial guess is below about 0.066.


So we can see that although the Newton-Raphson is very efficient, it requires good ini- 
tial guesses. For the Colebrook equation, a good strategy might be to employ the Swamee- 
Jain equation (Eq. 6.16) to provide the initial guess as in 


>> fSJ=1.325/log(e/(3.7*D)+5.74/Re40.9)42 


fSJ = 
0.02903099711265 


>> [f ea iter] =newtraph(g,dg,fSJ) 


y= 
0.02896781017144 
ea = 
8.510189472800060e-010 
iter = 
3 
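To see why the Swamee-Jain value makes such a good starting point, it helps to measure how far it already is from the converged root. The following is a minimal sketch, assuming the parameter values from the script of Prob. 6.40; the variable pctdiff is introduced here for illustration.

```matlab
% Assumed parameters (taken from the script in Prob. 6.40).
rho = 1.23; mu = 1.79e-5; D = 0.005; V = 40; e = 0.0015/1000;
Re = rho*V*D/mu;

% Swamee-Jain estimate; note that log is the natural logarithm here.
fSJ = 1.325/log(e/(3.7*D) + 5.74/Re^0.9)^2

% Percent difference from the converged Colebrook root reported above.
froot = 0.02896781;
pctdiff = abs(fSJ - froot)/froot*100
```

The explicit estimate is within a fraction of a percent of the root, so Newton-Raphson starts well inside its region of rapid convergence.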


Aside from our homemade functions, we can also use MATLAB’s built-in fzero function.
However, just as with the Newton-Raphson method, divergence can occur when
fzero is used with a single guess. In this case, it is guesses at the lower end
of the range that cause problems. For example,


>> fzero(g,0.008) 


Exiting fzero: aborting search for an interval containing a sign 
change because complex function value encountered ... 
during search. 
(Function value at -0.0028 is -4.92028-20.2423i.)
Check function or try again with a different starting value. 
ans = 
NaN 


If the iterations are displayed using optimset (recall Example 6.7), it is revealed that a nega- 
tive value occurs during the search phase before a sign change is detected and the routine 
aborts. However, for single initial guesses above about 0.016, the routine works nicely. 
For example, for the guess of 0.08 that caused problems for Newton-Raphson, fzero does 
just fine: 


>> fzero(g,0.08) 


ans = 
0.02896781017144 
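One way to sidestep the single-guess problem is to hand fzero a bracket rather than a point, so that it never has to search outside the physical range. The following is a minimal sketch, assuming the parameter values from the script of Prob. 6.40 and using the two ends of the range as the bracket:

```matlab
% Assumed parameters (taken from the script in Prob. 6.40).
rho = 1.23; mu = 1.79e-5; D = 0.005; V = 40; e = 0.0015/1000;
Re = rho*V*D/mu;
g = @(f) 1/sqrt(f) + 2*log10(e/(3.7*D) + 2.51/(Re*sqrt(f)));

% g changes sign between 0.008 and 0.08, so the pair is a valid bracket.
f = fzero(g, [0.008 0.08])
```

With a two-element starting vector, fzero skips its outward search phase, which is the phase that wandered into negative f above.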


As a final note, let’s see whether convergence is possible for simple fixed-point it- 
eration. The easiest and most straightforward version involves solving for the first f in 
Eq. (6.15): 


$$ f_{i+1} = \frac{0.25}{\left[\log\left(\dfrac{\varepsilon}{3.7D} + \dfrac{2.51}{\mathrm{Re}\sqrt{f_i}}\right)\right]^2} \qquad (6.17) $$
FIGURE 6.12


The two-curve display of this function indicates a surprising result (Fig. 6.12).
Recall that fixed-point iteration converges when the $y_2$ curve has a relatively flat slope (i.e.,
$|g'(\xi)| < 1$). As indicated by Fig. 6.12, the fact that the $y_2$ curve is quite flat in the range
from f = 0.008 to 0.08 means that not only does fixed-point iteration converge, but it con- 
verges fairly rapidly! In fact, for initial guesses anywhere between 0.008 and 0.08, fixed- 
point iteration yields predictions with percent relative errors less than 0.008% in six or 
fewer iterations! Thus, this simple approach that requires only one guess and no derivative 
estimates performs really well for this particular case. 
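The fixed-point scheme of Eq. (6.17) takes only a few lines. The following is a minimal sketch, assuming the parameter values from the script of Prob. 6.40, a starting guess at the upper end of the range, and a stopping criterion of 0.008%; the names gfp, es, and maxit are introduced here for illustration.

```matlab
% Assumed parameters (taken from the script in Prob. 6.40).
rho = 1.23; mu = 1.79e-5; D = 0.005; V = 40; e = 0.0015/1000;
Re = rho*V*D/mu;

% Right-hand side of Eq. (6.17); log10 is the base-10 logarithm.
gfp = @(f) 0.25/(log10(e/(3.7*D) + 2.51/(Re*sqrt(f))))^2;

f = 0.08;  es = 0.008;  maxit = 50;
for iter = 1:maxit
  fold = f;
  f = gfp(fold);                 % fixed-point update
  ea = abs((f - fold)/f)*100;    % approximate percent relative error
  if ea < es, break, end
end
f, ea, iter
```

Trying other starting values between 0.008 and 0.08 is a quick way to confirm how insensitive the scheme is to the initial guess.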

The take-home message from this case study is that even great, professionally devel- 
oped software like MATLAB is not always foolproof. Further, there is usually no single 
method that works best for all problems. Sophisticated users understand the strengths and 
weaknesses of the available numerical techniques. In addition, they understand enough 
of the underlying theory so that they can effectively deal with situations where a method 
breaks down. 
6.1 Employ fixed-point iteration to locate the root of

$$ f(x) = \sin(\sqrt{x}) - x $$

Use an initial guess of $x_0 = 0.5$ and iterate until $\varepsilon_a < 0.01\%$.
Verify that the process is linearly convergent as described at
the end of Sec. 6.1.

6.2 Use (a) fixed-point iteration and (b) the Newton-Raphson
method to determine a root of $f(x) = -0.9x^2 + 1.7x + 2.5$
using $x_0 = 5$. Perform the computation until $\varepsilon_a$ is less
than $\varepsilon_s = 0.01\%$. Also check your final answer.

6.3 Determine the highest real root of $f(x) = x^3 - 6x^2 + 11x - 6.1$:
(a) Graphically.
(b) Using the Newton-Raphson method (three iterations, $x_0 = 3.5$).
(c) Using the secant method (three iterations, $x_{-1} = 2.5$ and
$x_0 = 3.5$).
(d) Using the modified secant method (three iterations,
$x_0 = 3.5$, $\delta = 0.01$).
(e) Determine all the roots with MATLAB.

6.4 Determine the lowest positive root of $f(x) = 7\sin(x)e^{-x} - 1$:
(a) Graphically.
(b) Using the Newton-Raphson method (three iterations, $x_0 = 0.3$).
(c) Using the secant method (three iterations, $x_{-1} = 0.5$ and
$x_0 = 0.4$).
(d) Using the modified secant method (five iterations,
$x_0 = 0.3$, $\delta = 0.01$).

6.5 Use (a) the Newton-Raphson method and (b) the modified
secant method ($\delta = 0.05$) to determine a root of
$f(x) = x^5 - 16.05x^4 + 88.75x^3 - 192.0375x^2 + 116.35x + 31.6875$
using an initial guess of $x = 0.5825$ and $\varepsilon_s = 0.01\%$. Explain
your results.

6.6 Develop an M-file for the secant method. Along with
the two initial guesses, pass the function as an argument.
Test it by solving Prob. 6.3.

6.7 Develop an M-file for the modified secant method.
Along with the initial guess and the perturbation fraction,
pass the function as an argument. Test it by solving
Prob. 6.3.

6.8 Differentiate Eq. (E6.4.1) to get Eq. (E6.4.2).

6.9 Employ the Newton-Raphson method to determine a
real root for $f(x) = -2 + 6x - 4x^2 + 0.5x^3$, using an initial
guess of (a) 4.5 and (b) 4.43. Discuss and use graphical
and analytical methods to explain any peculiarities in your
results.


6.10 The “divide and average” method, an old-time method
for approximating the square root of any positive number a,
can be formulated as

$$ x_{i+1} = \frac{x_i + a/x_i}{2} $$

Prove that this formula is based on the Newton-Raphson
algorithm.
6.11 (a) Apply the Newton-Raphson method to the function
$f(x) = \tanh(x^2 - 9)$ to evaluate its known real root at x = 3.
Use an initial guess of $x_0 = 3.2$ and take a minimum of three
iterations. (b) Did the method exhibit convergence onto its real
root? Sketch the plot with the results for each iteration labeled.
6.12 The polynomial $f(x) = 0.0074x^4 - 0.284x^3 + 3.355x^2 - 12.183x + 5$
has a real root between 15 and 20. Apply the
Newton-Raphson method to this function using an initial
guess of $x_0 = 16.15$. Explain your results.
6.13 Mechanical engineers, as well as most other engineers,
use thermodynamics extensively in their work. The following
polynomial can be used to relate the zero-pressure specific
heat of dry air $c_p$ in kJ/(kg K) to temperature in K:

$$ c_p = 0.99403 + 1.671 \times 10^{-4}\,T + 9.7215 \times 10^{-8}\,T^2 - 9.5838 \times 10^{-11}\,T^3 + 1.9520 \times 10^{-14}\,T^4 $$

Write a MATLAB script (a) to plot $c_p$ versus a range of T = 0
to 1200 K and (b) to determine the temperature that corresponds
to a specific heat of 1.1 kJ/(kg K) with MATLAB
polynomial functions.

6.14 In a chemical engineering process, water vapor (H2O)
is heated to sufficiently high temperatures that a significant
portion of the water dissociates, or splits apart, to form oxygen
(O2) and hydrogen (H2):

$$ \mathrm{H_2O} \rightleftharpoons \mathrm{H_2} + \tfrac{1}{2}\mathrm{O_2} $$

If it is assumed that this is the only reaction involved, the
mole fraction x of H2O that dissociates can be represented by

$$ K = \frac{x}{1 - x}\sqrt{\frac{2p_t}{2 + x}} \qquad \text{(P6.14.1)} $$

where K is the reaction’s equilibrium constant and $p_t$ is the
total pressure of the mixture. If $p_t$ = 3 atm and K = 0.05,
determine the value of x that satisfies Eq. (P6.14.1).

6.15 The Redlich-Kwong equation of state is given by

$$ p = \frac{RT}{v - b} - \frac{a}{v(v + b)\sqrt{T}} $$

where R = the universal gas constant [= 0.518 kJ/(kg K)],
T = absolute temperature (K), p = absolute pressure (kPa),
and v = the volume of a kg of gas (m^3/kg). The parameters a
and b are calculated by

$$ a = 0.427\,\frac{R^2 T_c^{2.5}}{p_c} \qquad b = 0.0866\,R\,\frac{T_c}{p_c} $$

where $p_c$ = 4600 kPa and $T_c$ = 191 K. As a chemical engineer,
you are asked to determine the amount of methane fuel
that can be held in a 3-m^3 tank at a temperature of −40 °C
with a pressure of 65,000 kPa. Use a root-locating method
of your choice to calculate v and then determine the mass of
methane contained in the tank.

6.16 The volume of liquid V in a hollow horizontal cylinder
of radius r and length L is related to the depth of the liquid h by

$$ V = \left[ r^2 \cos^{-1}\left(\frac{r - h}{r}\right) - (r - h)\sqrt{2rh - h^2} \right] L $$

Determine h given r = 2 m, L = 5 m, and V = 8 m^3.

6.17 A catenary cable is one which is hung between
two points not in the same vertical line. As depicted in
Fig. P6.17a, it is subject to no loads other than its own
weight. Thus, its weight acts as a uniform load per unit
length along the cable w (N/m). A free-body diagram of a
section AB is depicted in Fig. P6.17b, where $T_A$ and $T_B$ are
the tension forces at the end. Based on horizontal and vertical
force balances, the following differential equation model
of the cable can be derived:

$$ \frac{d^2 y}{dx^2} = \frac{w}{T_A}\sqrt{1 + \left(\frac{dy}{dx}\right)^2} $$

FIGURE P6.17

Calculus can be employed to solve this equation for the
height of the cable y as a function of distance x:

$$ y = \frac{T_A}{w}\cosh\left(\frac{w}{T_A}x\right) + y_0 - \frac{T_A}{w} $$

(a) Use a numerical method to calculate a value for the
parameter $T_A$ given values for the parameters w = 10 and
$y_0$ = 5, such that the cable has a height of y = 15 at x = 50.
(b) Develop a plot of y versus x for x = −50 to 100.

6.18 An oscillating current in an electric circuit is described
by $I = 9e^{-t}\sin(2\pi t)$, where t is in seconds. Determine all
values of t such that I = 3.5.

6.19 Figure P6.19 shows a circuit with a resistor, an inductor,
and a capacitor in parallel. Kirchhoff’s rules can be used
to express the impedance of the system as

$$ \frac{1}{Z} = \sqrt{\frac{1}{R^2} + \left(\omega C - \frac{1}{\omega L}\right)^2} $$

where Z = impedance (Ω), and ω is the angular frequency. Find
the ω that results in an impedance of 100 Ω using the fzero
function with initial guesses of 1 and 1000 for the following
parameters: R = 225 Ω, C = 0.6 × 10^-6 F, and L = 0.5 H.

FIGURE P6.19
FIGURE P6.20

6.20 Real mechanical systems may involve the deflection
of nonlinear springs. In Fig. P6.20, a block of mass m is
released a distance h above a nonlinear spring. The resistance
force F of the spring is given by

$$ F = -(k_1 d + k_2 d^{3/2}) $$

Conservation of energy can be used to show that

$$ 0 = \frac{2k_2 d^{5/2}}{5} + \frac{1}{2}k_1 d^2 - mgd - mgh $$

Solve for d, given the following parameter values: $k_1$ =
40,000 g/s^2, $k_2$ = 40 g/(s^2 m^0.5), m = 95 g, g = 9.81 m/s^2, and
h = 0.43 m.

6.21 Aerospace engineers sometimes compute the trajectories
of projectiles such as rockets. A related problem deals
with the trajectory of a thrown ball. The trajectory of a ball
thrown by a right fielder is defined by the (x, y) coordinates
as displayed in Fig. P6.21. The trajectory can be modeled as

$$ y = (\tan\theta_0)\,x - \frac{g}{2v_0^2\cos^2\theta_0}\,x^2 + y_0 $$

FIGURE P6.21

Find the appropriate initial angle $\theta_0$ if $v_0$ = 30 m/s, and the
distance to the catcher is 90 m. Note that the throw leaves the
right fielder’s hand at an elevation of 1.8 m and the catcher
receives it at 1 m.
6.22 You are designing a spherical tank (Fig. P6.22) to hold
water for a small village in a developing country. The volume
of liquid it can hold can be computed as

$$ V = \pi h^2\,\frac{3R - h}{3} $$

where V = volume [m^3], h = depth of water in tank [m], and
R = the tank radius [m].

FIGURE P6.22

If R = 3 m, what depth must the tank be filled to so
that it holds 30 m^3? Use three iterations of the most efficient
numerical method possible to determine your answer. Determine
the approximate relative error after each iteration.
Also, provide justification for your choice of method. Extra
information: (a) For bracketing methods, initial guesses of
0 and R will bracket a single root for this example. (b) For
open methods, an initial guess of R will always converge.
6.23 Perform the identical MATLAB operations as those
in Example 6.8 to manipulate and find all the roots of the
polynomial

$$ f_5(x) = (x + 2)(x + 5)(x - 6)(x - 4)(x - 8) $$

6.24 In control systems analysis, transfer functions are
developed that mathematically relate the dynamics of a system’s
input to its output. A transfer function for a robotic
positioning system is given by

$$ G(s) = \frac{C(s)}{N(s)} = \frac{s^3 + 9s^2 + 26s + 24}{s^4 + 15s^3 + 77s^2 + 153s + 90} $$

where G(s) = system gain, C(s) = system output, N(s) =
system input, and s = Laplace transform complex frequency.
Use MATLAB to find the roots of the numerator and denominator
and factor these into the form

$$ G(s) = \frac{(s + a_1)(s + a_2)(s + a_3)}{(s + b_1)(s + b_2)(s + b_3)(s + b_4)} $$

where $a_i$ and $b_i$ = the roots of the numerator and denominator,
respectively.
6.25 The Manning equation can be written for a rectangular
open channel as

$$ Q = \frac{\sqrt{S}\,(BH)^{5/3}}{n(B + 2H)^{2/3}} $$

where Q = flow (m^3/s), S = slope (m/m), H = depth (m),
and n = the Manning roughness coefficient. Develop a fixed-point
iteration scheme to solve this equation for H given
Q = 5, S = 0.0002, B = 20, and n = 0.03. Perform the computation
until $\varepsilon_a$ is less than $\varepsilon_s$ = 0.05%. Prove that your scheme
converges for all initial guesses greater than or equal to zero.
6.26 See if you can develop a foolproof function to compute
the friction factor based on the Colebrook equation as
described in Sec. 6.7. Your function should return a precise
result for Reynolds number ranging from 4000 to 10^7 and for
ε/D ranging from 0.00001 to 0.05.

6.27 Use the Newton-Raphson method to find the root of

$$ f(x) = e^{-0.5x}(4 - x) - 2 $$

Employ initial guesses of (a) 2, (b) 6, and (c) 8. Explain
your results.
6.28 Given

$$ f(x) = -2x^6 - 1.5x^4 + 10x + 2 $$

Use a root-location technique to determine the maximum of
this function. Perform iterations until the approximate relative
error falls below 5%. If you use a bracketing method, use initial
guesses of $x_l$ = 0 and $x_u$ = 1. If you use the Newton-Raphson or
the modified secant method, use an initial guess of $x_i$ = 1. If you
use the secant method, use initial guesses of $x_{i-1}$ = 0 and $x_i$ = 1.
Assuming that convergence is not an issue, choose the technique
that is best suited to this problem. Justify your choice.
6.29 You must determine the root of the following easily
differentiable function:

$$ e^{0.5x} = 5 - 5x $$

Pick the best numerical technique, justify your choice, and
then use that technique to determine the root. Note that it
is known that for positive initial guesses, all techniques except
fixed-point iteration will eventually converge. Perform
iterations until the approximate relative error falls below 2%.
If you use a bracketing method, use initial guesses of $x_l$ = 0
and $x_u$ = 2. If you use the Newton-Raphson or the modified
secant method, use an initial guess of $x_i$ = 0.7. If you use
the secant method, use initial guesses of $x_{i-1}$ = 0 and $x_i$ = 2.
6.30 (a) Develop an M-file function to implement Brent’s
root-location method. Base your function on Fig. 6.10, but
with the beginning of the function changed to

function [b,fb] = fzeronew(f,xl,xu,varargin)
% fzeronew: Brent root location zeroes
%   [b,fb] = fzeronew(f,xl,xu,p1,p2,...):
%      uses Brent's method to find the root of f
% input:
%   f = name of function
%   xl, xu = lower and upper guesses
%   p1,p2,... = additional parameters used by f
% output:
%   b = real root
%   fb = function value at root

Make the appropriate modifications so that the function performs
as outlined in the documentation statements. In addition,
include error traps to ensure that the function’s three
required arguments (f,xl,xu) are prescribed, and that the
initial guesses bracket a root.
(b) Test your function by using it to solve for the root of the
function from Example 5.6 using

>> [x,fx] = fzeronew(@(x,n) x^n-1,0,1.3,10)

6.31 Figure P6.31 shows a side view of a broad-crested
weir. The symbols shown in Fig. P6.31 are defined as: $H_w$ =
the height of the weir (m), $H_h$ = the head above the weir
(m), and $H = H_w + H_h$ = the depth of the river upstream
of the weir (m).

FIGURE P6.31
A broad-crested weir used to control depth and
velocity of rivers and streams.

The flow across the weir, $Q_w$ (m^3/s), can be computed as
(Munson et al., 2009)

$$ Q_w = C_w B_w \sqrt{g}\left(\frac{2}{3}\right)^{3/2} H_h^{3/2} \qquad \text{(P6.31.1)} $$

where $C_w$ = a weir coefficient (dimensionless), $B_w$ = the
weir width (m), and g = the gravitational constant (m/s^2). $C_w$
can be determined using the weir height ($H_w$) as in

$$ C_w = 1.125\sqrt{\frac{1 + H_h/H_w}{2 + H_h/H_w}} \qquad \text{(P6.31.2)} $$

Given g = 9.81 m/s^2, $H_w$ = 0.8 m, $B_w$ = 8 m, and $Q_w$ = 1.3
m^3/s, determine the upstream depth, H, using (a) the modified
secant method with δ = 10^-5, (b) fixed-point iteration,
and (c) the MATLAB function fzero. For all cases employ
an initial guess of 0.5$H_w$, which for this case is 0.4. For (b)
also prove that your result will be convergent for positive
initial guesses.
6.32 The following reversible chemical reaction describes
how gaseous phases of methane and water react to form carbon
dioxide and hydrogen in a closed reactor,

$$ \mathrm{CH_4} + 2\mathrm{H_2O} \rightleftharpoons \mathrm{CO_2} + 4\mathrm{H_2} $$

with the equilibrium relationship

$$ K = \frac{[\mathrm{CO_2}][\mathrm{H_2}]^4}{[\mathrm{CH_4}][\mathrm{H_2O}]^2} $$

where K = the equilibrium coefficient and the brackets
[] designate molar concentrations (mole/L). Conservation
of mass can be used to reformulate the equilibrium
relationship as

$$ K = \frac{\left(\dfrac{x}{V}\right)\left(\dfrac{4x}{V}\right)^4}{\left(\dfrac{M_{\mathrm{CH_4}} - x}{V}\right)\left(\dfrac{M_{\mathrm{H_2O}} - 2x}{V}\right)^2} $$

where x = the number of moles created in the forward direction
(mole), V = the volume of the reactor (L), and $M_i$ = the
initial number of moles of constituent i (mole). Given that
K = 7 × 10^-3, V = 20 L, and $M_{\mathrm{CH_4}} = M_{\mathrm{H_2O}}$ = 1 moles,
determine x using (a) fixed-point iteration and (b) fzero.

6.33 The concentration of pollutant bacteria c in a lake de- 
creases according to 


c= Te} ay 20e7 0-08" 


Determine the time required for the bacteria concentration to 
be reduced to 15 using the Newton-Raphson method with an 
initial guess of t = 6 and a stopping criterion of 1%. Check 
your result with fzero. 

6.34 You are asked to solve for the root of the following 
equation with fixed-point iteration: 


x*=5x+10 


Determine the solution approach that converges for initial 
guesses in the range of 0 < x < 7. Use either a graphical or 
analytical approach to prove that your formulation always 
converges in the given range. 

6.35 A circular pipe made out of new cast iron is used to
convey water at a volume flow rate of Q = 0.3 m^3/s. Assume
that the flow is steady and fully developed and the water
is incompressible. The head loss, friction, and diameter are
related by the Darcy-Weisbach equation,

$$ h_L = f\,\frac{L}{D}\,\frac{V^2}{2g} \qquad \text{(P6.35.1)} $$

where f = the friction factor (dimensionless), L = length (m),
D = pipe inner diameter (m), V = velocity (m/s), and g =
the gravitational constant (= 9.81 m/s^2). The velocity can be
related to flow by

$$ Q = A_c V \qquad \text{(P6.35.2)} $$

where $A_c$ = the pipe’s cross-sectional area (m^2) = πD^2/4 and
the friction factor can be determined by the Colebrook equation.
If you want the head loss to be less than 0.006 m per
meter of pipe, develop a MATLAB function to determine
the smallest diameter pipe to achieve this objective. Use the
following parameter values: ν = 1.16 × 10^-6 m^2/s and ε =
0.4 mm.

6.36 Figure P6.36 shows an asymmetric diamond-shaped
supersonic airfoil. The orientation of the airfoil relative to
the airflow is represented by a number of angles: α = the
angle of attack, β = the shock angle, θ = the deflection
angle, with the subscripts “l” and “u” designating the lower
and upper surfaces of the airfoil. The following formula
relates the deflection angle to the oblique shock angle and
speed,

$$ \tan\theta = \frac{2\cot\beta\,(M^2\sin^2\beta - 1)}{M^2(k + \cos 2\beta) + 2} $$

FIGURE P6.36
A diamond-shaped airfoil.

where M = the Mach number which is the ratio of the jet’s
speed, v (m/s), to the speed of sound, c (m/s), where

$$ c = \sqrt{kRT_a} $$

where k = the ratio of specific heats which for air is $c_p/c_v$ (=
1.4), R = the air gas constant (= 287 N m/(kg K)), and $T_a$ =
the air’s absolute temperature (K). Given estimates of M, k,
and θ, the shock angle can be determined as the root of

$$ f(\beta) = \frac{2\cot\beta\,(M^2\sin^2\beta - 1)}{M^2(k + \cos 2\beta) + 2} - \tan\theta $$

The pressure on the airfoil surface, $p_a$ (kPa), can then be
computed as

$$ p_a = p\left(\frac{2k}{k + 1}(M\sin\beta)^2 - \frac{k - 1}{k + 1}\right) $$

Suppose that the airfoil is attached to a jet traveling at a
speed v = 625 m/s through air with a temperature T = 4 °C,
pressure p = 110 kPa, and $\theta_u$ = 4°. Develop a MATLAB
script to (a) generate a plot of $f(\beta_u)$ versus $\beta_u$ = 2° to 88°, and
(b) compute the pressure on the upper surface of the airfoil.
6.37 As described in Sec. 1.4, for objects falling through
fluids at very low speeds, the flow regime around the object
will be laminar and the relationship between the drag force
and velocity is linear. In addition, in such cases, the buoyancy
force must also be included. For such cases, a force
balance can be written as

$$ \frac{dv}{dt} = \underbrace{g}_{\text{(gravity)}} - \underbrace{\frac{\rho_f V g}{m}}_{\text{(buoyancy)}} - \underbrace{\frac{c_d}{m}v}_{\text{(drag)}} \qquad \text{(P6.37)} $$

where v = velocity (m/s), t = time (s), m = the mass of the
particle (kg), g = the gravitational constant (= 9.81 m/s^2),
$\rho_f$ = fluid density (kg/m^3), V = particle volume (m^3), and
$c_d$ = the linear drag coefficient (kg/s). Note that the mass of
the particle can be computed as $V\rho_s$, where $\rho_s$ = the density
of the particle (kg/m^3). For a small sphere, Stokes developed
the following formula for the drag coefficient, $c_d = 6\pi\mu r$,
where μ = the fluid’s dynamic viscosity (N s/m^2), and r =
the sphere’s radius (m).

You release an iron sphere at the surface of a container
(x = 0) filled with honey (Fig. P6.37) and then measure how
long it takes to settle to the bottom (x = L). Use this information
to estimate the honey’s viscosity based on the following
parameter values: $\rho_f$ = 1420 kg/m^3, $\rho_s$ = 7850 kg/m^3,
r = 0.02 m, L = 0.5 m, and t(x = 0.5) = 3.6 s. Check the
Reynolds number (Re = $\rho_f v d/\mu$, where d = diameter) to confirm
that laminar conditions occurred during the experiment.

[Hint: The problem can be solved by integrating Eq. (P6.37)
two times to yield an equation for x as a function of t.]

FIGURE P6.37
A sphere settling in a cylinder filled with viscous honey.


6.38 As depicted in Fig. P6.38a, a scoreboard is suspended
above a sports arena by two cables, pinned at A, B, and C.
The cables are initially horizontal and of length L. After the
scoreboard is hung, a free-body diagram at node B can be developed
as shown in Fig. P6.38b. Assuming that the weight of
each cable is negligible, determine the deflection, d (m), that
results if the scoreboard weighs W = 9000 N. Also, compute
how much each cable is elongated. Note that each cable obeys
Hooke’s law such that the axial elongation is represented by
$L' - L = FL/(A_c E)$, where F = the axial force (N), $A_c$ = the
cable’s cross-sectional area (m^2), and E = the modulus of elasticity
(N/m^2). Use the following parameters for your calculations:
L = 45 m, $A_c$ = 6.362 × 10^-5 m^2, and E = 1.5 × 10^11 N/m^2.

FIGURE P6.38
(a) Two thin cables pinned at A, B, and C with a scoreboard
suspended from B. (b) Free-body diagram of the
pin at B after the scoreboard is hung.
6.39 A water tower is connected to a pipe with a valve at its
end as depicted in Fig. P6.39. Under a number of simplifying
assumptions (e.g., minor friction losses neglected), the
following energy balance can be written

$$ gh - \frac{v^2}{2} = f\left(\frac{L + h}{d} + \frac{L_{e,e}}{d} + \frac{L_{e,v}}{d}\right)\frac{v^2}{2} + K\frac{v^2}{2} $$

where g = gravitational acceleration (= 9.81 m/s^2), h =
tower height (m), v = mean water velocity in pipe (m/s),
f = the pipe’s friction factor, L = horizontal pipe length (m),
d = pipe diameter (m), $L_{e,e}$ = equivalent length for the elbow
(m), $L_{e,v}$ = equivalent length for the valve (m), and K = loss
coefficient for the contraction at the bottom of the tank.
Write a MATLAB script to determine the flow exiting the
valve, Q (m^3/s), using the following parameter values: h =
24 m, L = 65 m, d = 100 mm, $L_{e,e}/d$ = 30, $L_{e,v}/d$ = 8, and
K = 0.5. In addition, the kinematic viscosity of water is ν =
μ/ρ = 1.2 × 10^-6 m^2/s.

FIGURE P6.39
A water tower connected to a pipe with a valve at
its end.

6.40 Modify the fzerosimp function (Fig. 6.10) so that it 
can be passed any function with a single unknown and uses 
varargin to pass the function’s parameters. Then test it with 
the following script to obtain a solution for pipe friction 
based on Case Study 6.3, 


clc 

format long, format compact 
rho=1.23;mu=1.79e-5;D=0.005;V=40;e=0.0015/1000; 
Re=rho*V*D/mu; 

g=@(f,e,D) 1/sqrt(f)+2*log10(e/(3.7*D)+2.51/(Re*sqrt(f)));

f=fzerosimp(@(x) g(x,e,D),0.008,0.08) 


Optimization 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to how optimization can be 
used to determine minima and maxima of both one-dimensional and multidimensional 
functions. Specific objectives and topics covered are 


Understanding why and where optimization occurs in engineering and scientific 
problem solving. 

Recognizing the difference between one-dimensional and multidimensional 
optimization. 

Distinguishing between global and local optima. 

Knowing how to recast a maximization problem so that it can be solved with a 
minimizing algorithm. 

Being able to define the golden ratio and understand why it makes one- 
dimensional optimization efficient. 

Locating the optimum of a single-variable function with the golden-section search.

Locating the optimum of a single-variable function with parabolic interpolation.

Knowing how to apply the fminbnd function to determine the minimum of a 
one-dimensional function. 

Being able to develop MATLAB contour and surface plots to visualize two- 
dimensional functions. 

Knowing how to apply the fminsearch function to determine the minimum of a 
multidimensional function. 


YOU’VE GOT A PROBLEM 


An object like a bungee jumper can be projected upward at a specified velocity. If it
is subject to linear drag, its altitude as a function of time can be computed as

$$ z = z_0 + \frac{m}{c}\left(v_0 + \frac{mg}{c}\right)\left(1 - e^{-(c/m)t}\right) - \frac{mg}{c}\,t \qquad (7.1) $$
FIGURE 7.1
Elevation as a function of time for an object initially projected upward with an initial velocity.


where z = altitude (m) above the earth’s surface (defined as z = 0), $z_0$ = the initial altitude
(m), m = mass (kg), c = a linear drag coefficient (kg/s), $v_0$ = initial velocity (m/s), and t =
time (s). Note that for this formulation, positive velocity is considered to be in the upward
direction. Given the following parameter values: g = 9.81 m/s^2, $z_0$ = 100 m, $v_0$ = 55 m/s,
m = 80 kg, and c = 15 kg/s, Eq. (7.1) can be used to calculate the jumper’s altitude. As
displayed in Fig. 7.1, the jumper rises to a peak elevation of about 190 m at about t = 4 s.
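A rough version of this reading can be had without a plot by evaluating Eq. (7.1) on a fine grid and picking off the largest value. The following is a minimal sketch; the grid range of 0 to 12 s and the 0.01-s spacing are choices made here for illustration.

```matlab
g = 9.81; z0 = 100; v0 = 55; m = 80; c = 15;

% Eq. (7.1): altitude as a function of time (works for vector t).
z = @(t) z0 + m/c*(v0 + m*g/c)*(1 - exp(-c/m*t)) - m*g/c*t;

t = linspace(0, 12, 1201);     % 0.01-s spacing
[zmax, imax] = max(z(t));      % largest altitude on the grid
tmax = t(imax)
zmax
```

The answer is only as good as the grid spacing, which is one reason the more systematic methods of this chapter are worth having.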

Suppose that you are given the job of determining the exact time of the peak elevation. 
The determination of such extreme values is referred to as optimization. This chapter will 
introduce you to how the computer is used to make such determinations. 


INTRODUCTION AND BACKGROUND 


In the most general sense, optimization is the process of creating something that is as 
effective as possible. As engineers, we must continuously design devices and products 
that perform tasks in an efficient fashion for the least cost. Thus, engineers are always 
confronting optimization problems that attempt to balance performance and limitations. In 
addition, scientists have interest in optimal phenomena ranging from the peak elevation of 
projectiles to the minimum free energy. 

From a mathematical perspective, optimization deals with finding the maxima and 
minima of a function that depends on one or more variables. The goal is to determine the 
values of the variables that yield maxima or minima for the function. These can then be 
substituted back into the function to compute its optimal values. 

Although these solutions can sometimes be obtained analytically, most practical 
optimization problems require numerical, computer solutions. From a numerical stand- 
point, optimization is similar in spirit to the root-location methods we just covered in 
Chaps. 5 and 6. That is, both involve guessing and searching for a point on a function. The 
fundamental difference between the two types of problems is illustrated in Fig. 7.2. Root 
location involves searching for the location where the function equals zero. In contrast, 
optimization involves searching for the function’s extreme points. 
FIGURE 7.2
A function of a single variable illustrating the difference between roots and optima. 


As can be seen in Fig. 7.2, the optima are the points where the curve is flat. In mathematical
terms, this corresponds to the x value where the derivative $f'(x)$ is equal to zero.
Additionally, the second derivative, $f''(x)$, indicates whether the optimum is a minimum or a
maximum: if $f''(x) < 0$, the point is a maximum; if $f''(x) > 0$, the point is a minimum.

Now, understanding the relationship between roots and optima would suggest a pos- 
sible strategy for finding the latter. That is, you can differentiate the function and locate the 
root (i.e., the zero) of the new function. In fact, some optimization methods do just this by 
solving the root problem: f'(x) = 0. 


Determining the Optimum Analytically by Root Location 


Problem Statement. Determine the time and magnitude of the peak elevation based on
Eq. (7.1). Use the following parameter values for your calculation: g = 9.81 m/s^2, $z_0$ = 100 m,
$v_0$ = 55 m/s, m = 80 kg, and c = 15 kg/s.


Solution. Equation (7.1) can be differentiated to give 


$$ \frac{dz}{dt} = v_0 e^{-(c/m)t} - \frac{mg}{c}\left(1 - e^{-(c/m)t}\right) \qquad \text{(E7.1.1)} $$


Note that because v = dz/dt, this is actually the equation for the velocity. The maximum
elevation occurs at the value of t that drives this equation to zero. Thus, the problem amounts
to determining the root. For this case, this can be accomplished by setting the derivative to
zero and solving Eq. (E7.1.1) analytically for

$$ t = \frac{m}{c}\ln\left(1 + \frac{cv_0}{mg}\right) $$


Substituting the parameters gives 


$$ t = \frac{80}{15}\ln\left(1 + \frac{15(55)}{80(9.81)}\right) = 3.83166 \text{ s} $$

This value along with the parameters can then be substituted into Eq. (7.1) to compute the
maximum elevation as

$$ z = 100 + \frac{80}{15}\left(55 + \frac{80(9.81)}{15}\right)\left(1 - e^{-(15/80)3.83166}\right) - \frac{80(9.81)}{15}(3.83166) = 192.8609 \text{ m} $$


We can verify that the result is a maximum by differentiating Eq. (E7.1.1) to obtain 
the second derivative 


$$ \frac{d^2 z}{dt^2} = -\frac{c}{m}v_0 e^{-(c/m)t} - g e^{-(c/m)t} = -9.81 \ \frac{\text{m}}{\text{s}^2} $$


The fact that the second derivative is negative tells us that we have a maximum. Further, 
the result makes physical sense since the acceleration should be solely equal to the force of 
gravity at the maximum when the vertical velocity (and hence drag) is zero. 

Although an analytical solution was possible for this case, we could have obtained the 
same result using the root-location methods described in Chaps. 5 and 6. This will be left 
as a homework exercise. 
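The arithmetic of the example is easy to confirm in MATLAB. The following is a minimal sketch that evaluates the analytical root, the peak elevation from Eq. (7.1), and the second derivative at the peak; the names tpk, zpk, and d2z are introduced here for illustration.

```matlab
g = 9.81; z0 = 100; v0 = 55; m = 80; c = 15;

% Eq. (7.1): altitude as a function of time.
z = @(t) z0 + m/c*(v0 + m*g/c)*(1 - exp(-c/m*t)) - m*g/c*t;

tpk = m/c*log(1 + c*v0/(m*g))                     % time of zero velocity
zpk = z(tpk)                                      % peak elevation
d2z = -c/m*v0*exp(-c/m*tpk) - g*exp(-c/m*tpk)     % second derivative at tpk
```

The second derivative comes out equal to −g, consistent with the physical argument that drag vanishes at the instant the velocity is zero.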


Although it is certainly possible to approach optimization as a roots problem, a variety 
of direct numerical optimization methods are available. These methods are available for both 
one-dimensional and multidimensional problems. As the name implies, one-dimensional
problems involve functions that depend on a single independent variable. As in Fig. 7.3a, the
search then consists of climbing or descending one-dimensional peaks and valleys.
Multidimensional problems involve functions that depend on two or more independent variables.


FIGURE 7.3 

(a) One-dimensional optimization. This figure also illustrates how minimization of f(x) is 
equivalent to the maximization of —f(x). (b) Two-dimensional optimization. Note that this 
figure can be taken to represent either a maximization (contours increase in elevation up to 
the maximum like a mountain) or a minimization (contours decrease in elevation down to the 
minimum like a valley). 
In the same spirit, a two-dimensional optimization can again be visualized as searching out
peaks and valleys (Fig. 7.3b). However, just as in real hiking, we are not constrained to walk 
a single direction; instead the topography is examined to efficiently reach the goal. 

Finally, the process of finding a maximum versus finding a minimum is essentially 
identical because the same value $x^*$ both minimizes f(x) and maximizes −f(x). This equiva-
lence is illustrated graphically for a one-dimensional function in Fig. 7.3a. 
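This equivalence is what lets a minimizer locate a peak. The following is a minimal sketch that applies MATLAB’s fminbnd function to the negative of Eq. (7.1); the search interval of 0 to 8 s is an assumption chosen here so that it contains the peak.

```matlab
g = 9.81; z0 = 100; v0 = 55; m = 80; c = 15;

% Eq. (7.1): altitude as a function of time.
z = @(t) z0 + m/c*(v0 + m*g/c)*(1 - exp(-c/m*t)) - m*g/c*t;

% Minimizing -z(t) is the same as maximizing z(t).
[tpk, fval] = fminbnd(@(t) -z(t), 0, 8);
tpk
zpk = -fval            % flip the sign back to recover the maximum
```

Note that the sign of the returned function value must be reversed to recover the peak elevation itself.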

In the next section, we will describe some of the more common approaches for one- 
dimensional optimization. Then we will provide a brief description of how MATLAB can 
be employed to determine optima for multidimensional functions. 


7.2 ONE-DIMENSIONAL OPTIMIZATION


This section will describe techniques to find the minimum or maximum of a function of a
single variable f(x). A useful image in this regard is the one-dimensional “roller coaster”-like
function depicted in Fig. 7.4. Recall from Chaps. 5 and 6 that root location was complicated
by the fact that several roots can occur for a single function. Similarly, both local
and global optima can occur in optimization.

A global optimum represents the very best solution. A local optimum, though not the 
very best, is better than its immediate neighbors. Cases that include local optima are called 
multimodal. In such cases, we will almost always be interested in finding the global optimum. 
In addition, we must be concerned about mistaking a local result for the global optimum. 

Just as in root location, optimization in one dimension can be divided into bracketing 
and open methods. As described in the next section, the golden-section search is an example 
of a bracketing method that is very similar in spirit to the bisection method for root location. 
This is followed by a somewhat more sophisticated bracketing approach—parabolic inter- 
polation. We will then show how these two methods are combined and implemented with 
MATLAB’s fminbnd function. 


FIGURE 7.4 

A function that asymptotically approaches zero at plus and minus ∞ and has two maximum
and two minimum points in the vicinity of the origin. The two points to the right are local
optima, whereas the two to the left are global.
7.2.1 Golden-Section Search


In many cultures, certain numbers are ascribed magical qualities. For example, we in the 
West are all familiar with “lucky 7” and “Friday the 13th.” Beyond such superstitious quan- 
tities, there are several well-known numbers that have such interesting and powerful math- 
ematical properties that they could truly be called “magical.” The most common of these are 
the ratio of a circle’s circumference to its diameter π and the base of the natural logarithm e.

Although not as widely known, the golden ratio should surely be included in the pantheon
of remarkable numbers. This quantity, which is typically represented by the Greek
letter $\phi$ (pronounced: fee), was originally defined by Euclid (ca. 300 BCE) because of
its role in the construction of the pentagram or five-pointed star. As depicted in Fig. 7.5,
Euclid’s definition reads: “A straight line is said to have been cut in extreme and mean ratio 
when, as the whole line is to the greater segment, so is the greater to the lesser.” 

The actual value of the golden ratio can be derived by expressing Euclid’s definition as

$$\frac{\ell_1 + \ell_2}{\ell_1} = \frac{\ell_1}{\ell_2} \tag{7.2}$$

Multiplying by $\ell_1/\ell_2$ and collecting terms yields

$$\phi^2 - \phi - 1 = 0 \tag{7.3}$$

where $\phi = \ell_1/\ell_2$. The positive root of this equation is the golden ratio:

$$\phi = \frac{1 + \sqrt{5}}{2} = 1.61803398874989\ldots \tag{7.4}$$
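As a quick numerical check of Eqs. (7.3) and (7.4), the quadratic can be handed to MATLAB's built-in roots function. This is only a sketch; the variable names are illustrative.

```matlab
% Roots of phi^2 - phi - 1 = 0, coefficients in descending powers.
r = roots([1 -1 -1]);
phi = max(r);                    % keep the positive root
phi_closed = (1 + sqrt(5))/2;    % closed form from Eq. (7.4)

fprintf('phi = %.14f\n', phi);
fprintf('difference from closed form = %.2e\n', abs(phi - phi_closed));
```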


The golden ratio has long been considered aesthetically pleasing in Western cultures. 
In addition, it arises in a variety of other contexts including biology. For our purposes, it 
provides the basis for the golden-section search, a simple, general-purpose method for de- 
termining the optimum of a single-variable function. 

The golden-section search is similar in spirit to the bisection approach for locating
roots in Chap. 5. Recall that bisection hinged on defining an interval, specified by a lower
guess ($x_l$) and an upper guess ($x_u$) that bracketed a single root. The presence of a root
between these bounds was verified by determining that $f(x_l)$ and $f(x_u)$ had different signs. The
root was then estimated as the midpoint of this interval:

$$x_r = \frac{x_l + x_u}{2} \tag{7.5}$$


FIGURE 7.5 

Euclid’s definition of the golden ratio is based on dividing a line into two segments so that 
the ratio of the whole line to the larger segment is equal to the ratio of the larger segment to 
the smaller segment. This ratio is called the golden ratio. 
The final step in a bisection iteration involved determining a new smaller bracket. This
was done by replacing whichever of the bounds $x_l$ or $x_u$ had a function value with the same
sign as $f(x_r)$. A key advantage of this approach was that the new value $x_r$ replaced one of
the old bounds.

Now suppose that instead of a root, we were interested in determining the minimum 
of a one-dimensional function. As with bisection, we can start by defining an interval that 
contains a single answer. That is, the interval should contain a single minimum, and hence 
is called unimodal. We can adopt the same nomenclature as for bisection, where $x_l$ and $x_u$
defined the lower and upper bounds, respectively, of such an interval. However, in contrast 
to bisection, we need a new strategy for finding a minimum within the interval. Rather than 
using a single intermediate value (which is sufficient to detect a sign change, and hence 
a zero), we would need two intermediate function values to detect whether a minimum 
occurred. 

The key to making this approach efficient is the wise choice of the intermediate points. 
As in bisection, the goal is to minimize function evaluations by replacing old values with 
new values. For bisection, this was accomplished by choosing the midpoint. For the golden- 
section search, the two intermediate points are chosen according to the golden ratio: 


$$x_1 = x_l + d \tag{7.6}$$

$$x_2 = x_u - d \tag{7.7}$$

where

$$d = (\phi - 1)(x_u - x_l) \tag{7.8}$$

The function is evaluated at these two interior points. Two results can occur: 


1. If, as in Fig. 7.6a, $f(x_1) < f(x_2)$, then $f(x_1)$ is the minimum, and the domain of x to the
left of $x_2$, from $x_l$ to $x_2$, can be eliminated because it does not contain the minimum. For
this case, $x_2$ becomes the new $x_l$ for the next round.

2. If $f(x_2) < f(x_1)$, then $f(x_2)$ is the minimum and the domain of x to the right of $x_1$, from
$x_1$ to $x_u$, would be eliminated. For this case, $x_1$ becomes the new $x_u$ for the next round.


Now, here is the real benefit from the use of the golden ratio. Because the original $x_1$
and $x_2$ were chosen using the golden ratio, we do not have to recalculate all the function
values for the next iteration. For example, for the case illustrated in Fig. 7.6, the old $x_1$
becomes the new $x_2$. This means that we already have the value for the new $f(x_2)$, since it is
the same as the function value at the old $x_1$.

To complete the algorithm, we need only determine the new $x_1$. This is done with
Eq. (7.6) with d computed with Eq. (7.8) based on the new values of $x_l$ and $x_u$. A similar
approach would be used for the alternate case where the optimum fell in the left subinterval.
For this case, the new $x_2$ would be computed with Eq. (7.7).

As the iterations are repeated, the interval containing the extremum is reduced
rapidly. In fact, each round the interval shrinks to $\phi - 1$ (about 61.8%) of its previous
length. That means that after 10 rounds, the interval is shrunk to about $0.618^{10}$ or 0.008 or 0.8%
of its initial length. After 20 rounds, it is about 0.0066%. This is not quite as good as
the reduction achieved with bisection (50% per round), but optimization is a harder problem than
root location.
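The reduction figures quoted above can be reproduced with a few lines. This is a minimal sketch; the variable names are illustrative.

```matlab
phi = (1 + sqrt(5))/2;
rounds = [1 10 20];
golden = (phi - 1).^rounds;   % fraction of the initial interval remaining
bisect = 0.5.^rounds;         % same quantity for bisection, for comparison

for k = 1:numel(rounds)
    fprintf('%2d rounds: golden %.4f%%, bisection %.6f%%\n', ...
            rounds(k), 100*golden(k), 100*bisect(k));
end
```

The golden-section column gives about 0.81% after 10 rounds and about 0.0066% after 20, in line with the values in the text.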

FIGURE 7.6 

(a) The initial step of the golden-section search algorithm involves choosing two interior 
points according to the golden ratio. (b) The second step involves defining a new interval 
that encompasses the optimum. 


EXAMPLE 7.2

Golden-Section Search


Problem Statement. Use the golden-section search to find the minimum of

$$f(x) = \frac{x^2}{10} - 2\sin x$$

within the interval from $x_l = 0$ to $x_u = 4$.


Solution. First, the golden ratio is used to create the two interior points:

$$d = 0.61803(4 - 0) = 2.4721$$
$$x_1 = 0 + 2.4721 = 2.4721$$
$$x_2 = 4 - 2.4721 = 1.5279$$

The function can be evaluated at the interior points:

$$f(x_2) = \frac{1.5279^2}{10} - 2\sin(1.5279) = -1.7647$$
$$f(x_1) = \frac{2.4721^2}{10} - 2\sin(2.4721) = -0.6300$$
Because $f(x_2) < f(x_1)$, our best estimate of the minimum at this point is that it is
located at $x = 1.5279$ with a value of $f(x) = -1.7647$. In addition, we also know that the
minimum is in the interval defined by $x_l$, $x_2$, and $x_1$. Thus, for the next iteration, the lower
bound remains $x_l = 0$, and $x_1$ becomes the upper bound, that is, $x_u = 2.4721$. In addition,
the former $x_2$ value becomes the new $x_1$, that is, $x_1 = 1.5279$. We do not have to
recalculate $f(x_1)$, because it was determined on the previous iteration as $f(1.5279) = -1.7647$.

All that remains is to use Eqs. (7.8) and (7.7) to compute the new value of d and $x_2$:

$$d = 0.61803(2.4721 - 0) = 1.5279$$
$$x_2 = 2.4721 - 1.5279 = 0.9443$$

The function evaluation at $x_2$ is $f(0.9443) = -1.5310$. Since this value is greater than the
function value at $x_1$, the minimum is still $f(1.5279) = -1.7647$, and it is in the interval
prescribed by $x_2$, $x_1$, and $x_u$. The process can be repeated, with the results tabulated here:
i   x_l      f(x_l)    x_2      f(x_2)     x_1      f(x_1)     x_u      f(x_u)    d
1   0         0        1.5279   -1.7647*   2.4721   -0.6300    4.0000    3.1136   2.4721
2   0         0        0.9443   -1.5310    1.5279   -1.7647*   2.4721   -0.6300   1.5279
3   0.9443   -1.5310   1.5279   -1.7647*   1.8885   -1.5432    2.4721   -0.6300   0.9443
4   0.9443   -1.5310   1.3050   -1.7595    1.5279   -1.7647*   1.8885   -1.5432   0.5836
5   1.3050   -1.7595   1.5279   -1.7647*   1.6656   -1.7136    1.8885   -1.5432   0.3607
6   1.3050   -1.7595   1.4427   -1.7755*   1.5279   -1.7647    1.6656   -1.7136   0.2229
7   1.3050   -1.7595   1.3901   -1.7742    1.4427   -1.7755*   1.5279   -1.7647   0.1378
8   1.3901   -1.7742   1.4427   -1.7755*   1.4752   -1.7732    1.5279   -1.7647   0.0851


Note that the current minimum is marked with an asterisk for every iteration. After the eighth
iteration, the minimum occurs at $x = 1.4427$ with a function value of $-1.7755$. Thus, the
result is converging on the true value of $-1.7757$ at $x = 1.4276$.
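The hand calculation above can be reproduced with a short script. This is a minimal sketch rather than the M-file presented later in the section: the loop count of 8 and the variable names are assumptions made to mirror the table.

```matlab
f = @(x) x.^2/10 - 2*sin(x);      % function from the example
phi = (1 + sqrt(5))/2;
xl = 0; xu = 4;
d = (phi - 1)*(xu - xl);          % Eq. (7.8)
x1 = xl + d; x2 = xu - d;         % Eqs. (7.6) and (7.7)
f1 = f(x1); f2 = f(x2);

fprintf(' i     xl      x2     f(x2)      x1     f(x1)      xu       d\n');
for i = 1:8
    fprintf('%2d %7.4f %7.4f %8.4f %7.4f %8.4f %7.4f %7.4f\n', ...
            i, xl, x2, f2, x1, f1, xu, d);
    if f1 < f2                    % minimum lies in [x2, xu]
        xopt = x1;
        xl = x2; x2 = x1; f2 = f1;        % reuse old x1 as the new x2
        d = (phi - 1)*(xu - xl);
        x1 = xl + d; f1 = f(x1);          % the only new evaluation
    else                          % minimum lies in [xl, x1]
        xopt = x2;
        xu = x1; x1 = x2; f1 = f2;        % reuse old x2 as the new x1
        d = (phi - 1)*(xu - xl);
        x2 = xu - d; f2 = f(x2);          % the only new evaluation
    end
end
fprintf('best estimate after 8 iterations: x = %.4f, f(x) = %.4f\n', ...
        xopt, f(xopt));
```

Each pass through the loop calls f only once, which is the saving that the golden ratio provides.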


Recall that for bisection (Sec. 5.4), an exact upper bound for the error can be calculated
at each iteration. Using similar reasoning, an upper bound for golden-section search can be
derived as follows: Once an iteration is complete, the optimum will fall in one of two
intervals. If the optimum function value is at $x_2$, it will be in the lower interval ($x_l$, $x_2$, $x_1$). If
the optimum function value is at $x_1$, it will be in the upper interval ($x_2$, $x_1$, $x_u$). Because the
interior points are symmetrical, either case can be used to define the error.
Looking at the upper interval ($x_2$, $x_1$, $x_u$), if the true value were at the far left, the maximum
distance from the estimate would be

$$\begin{aligned}
\Delta x_a &= x_1 - x_2 \\
&= x_l + (\phi - 1)(x_u - x_l) - x_u + (\phi - 1)(x_u - x_l) \\
&= (x_l - x_u) + 2(\phi - 1)(x_u - x_l) \\
&= (2\phi - 3)(x_u - x_l)
\end{aligned}$$

or $0.2361\,(x_u - x_l)$. If the true value were at the far right, the maximum distance from the
estimate would be

$$\begin{aligned}
\Delta x_b &= x_u - x_1 \\
&= x_u - x_l - (\phi - 1)(x_u - x_l) \\
&= (x_u - x_l) - (\phi - 1)(x_u - x_l) \\
&= (2 - \phi)(x_u - x_l)
\end{aligned}$$

or $0.3820\,(x_u - x_l)$. Therefore, this case would represent the maximum error. This result can
then be normalized to the optimal value for that iteration $x_{\text{opt}}$ to yield

$$\varepsilon_a = (2 - \phi)\left|\frac{x_u - x_l}{x_{\text{opt}}}\right| \times 100\% \tag{7.9}$$

This estimate provides a basis for terminating the iterations. 
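The two distance factors and Eq. (7.9) can be checked numerically. The sketch below assumes the bracket from the worked golden-section example ($x_l = 0$, $x_u = 4$) and takes the first-iteration estimate $x_2$ as $x_{\text{opt}}$; the variable names are illustrative.

```matlab
phi = (1 + sqrt(5))/2;
xl = 0; xu = 4;                 % bracket from the worked example
d  = (phi - 1)*(xu - xl);
x1 = xl + d; x2 = xu - d;

dxa = x1 - x2;                  % true value at far left of upper interval
dxb = xu - x1;                  % true value at far right of upper interval
fprintf('dxa/(xu-xl) = %.4f  (2*phi-3 = %.4f)\n', dxa/(xu-xl), 2*phi-3);
fprintf('dxb/(xu-xl) = %.4f  (2-phi   = %.4f)\n', dxb/(xu-xl), 2-phi);

xopt = x2;                      % best estimate after the first iteration
ea = (2 - phi)*abs((xu - xl)/xopt)*100;   % Eq. (7.9)
fprintf('ea = %.1f%%\n', ea);
```

For this first iteration the bound works out to 100% because, with $x_l = 0$, the estimate $x_2$ equals $(2 - \phi)(x_u - x_l)$; the bound then shrinks as the bracket narrows.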

An M-file function for the golden-section search for minimization is presented in 
Fig. 7.7. The function returns the location of the minimum, the value of the function, the 
approximate error, and the number of iterations. 

The M-file can be used to solve the problem from Example 7.1. 

>> g=9.81;v0=55;m=80;c=15;z0=100;
>> z=@(t) -(z0+m/c*(v0+m*g/c)*(1-exp(-c/m*t))-m*g/c*t);
>> [xmin,fmin,ea]=goldmin(z,0,8)

xmin =
    3.8317
fmin =
 -192.8609
ea =
  6.9356e-005


Notice how because this is a maximization, we have entered the negative of Eq. (7.1). 
Consequently, fmin corresponds to a maximum height of 192.8609. 

You may be wondering why we have stressed the reduced function evaluations of 
the golden-section search. Of course, for solving a single optimization, the speed savings 
would be negligible. However, there are two important contexts where minimizing the 
number of function evaluations can be important. These are 


1. Many evaluations. There are cases where the golden-section search algorithm may 
be a part of a much larger calculation. In such cases, it may be called many times. 
Therefore, keeping function evaluations to a minimum could pay great dividends for 
such cases. 
function [x,fx,ea,iter]=goldmin(f,xl,xu,es,maxit,varargin)
% goldmin: minimization golden section search
%   [x,fx,ea,iter]=goldmin(f,xl,xu,es,maxit,p1,p2,...):
%      uses golden section search to find the minimum of f
% input:
%   f = name of function
%   xl, xu = lower and upper guesses
%   es = desired relative error (default = 0.0001%)
%   maxit = maximum allowable iterations (default = 50)
%   p1,p2,... = additional parameters used by f
% output:
%   x = location of minimum
%   fx = minimum function value
%   ea = approximate relative error (%)
%   iter = number of iterations

if nargin<3,error('at least 3 input arguments required'),end
if nargin<4||isempty(es), es=0.0001;end
if nargin<5||isempty(maxit), maxit=50;end
phi=(1+sqrt(5))/2; iter = 0;
d = (phi-1)*(xu - xl);
x1 = xl + d; x2 = xu - d;
f1 = f(x1,varargin{:}); f2 = f(x2,varargin{:});
while(1)
  xint = xu - xl;
  if f1 < f2
    xopt = x1; xl = x2; x2 = x1; f2 = f1;
    x1 = xl + (phi-1)*(xu-xl); f1 = f(x1,varargin{:});
  else
    xopt = x2; xu = x1; x1 = x2; f1 = f2;
    x2 = xu - (phi-1)*(xu-xl); f2 = f(x2,varargin{:});
  end
  iter=iter+1;
  if xopt~=0, ea = (2 - phi) * abs(xint / xopt) * 100;end
  if ea <= es || iter >= maxit,break,end
end
x=xopt; fx=f(xopt,varargin{:});


FIGURE 7.7 
An M-file to determine the minimum of a function with the golden-section search. 


2. Time-consuming evaluation. For pedagogical reasons, we use simple functions in 
most of our examples. You should understand that a function can be very complex 
and time-consuming to evaluate. For example, optimization can be used to estimate 
the parameters of a model consisting of a system of differential equations. For such 
cases, the “function” involves time-consuming model integration. Any method that 
minimizes such evaluations would be advantageous. 
EXAMPLE 7.3

FIGURE 7.8 
Graphical depiction of parabolic interpolation. 


7.2.2 Parabolic Interpolation 


Parabolic interpolation takes advantage of the fact that a second-order polynomial often 
provides a good approximation to the shape of f(x) near an optimum (Fig. 7.8). 

Just as there is only one straight line connecting two points, there is only one parabola 
connecting three points. Thus, if we have three points that jointly bracket an optimum, we 
can fit a parabola to the points. Then we can differentiate it, set the result equal to zero, and 
solve for an estimate of the optimal x. It can be shown through some algebraic manipula- 
tions that the result is 


$$x_4 = x_2 - \frac{1}{2}\,\frac{(x_2 - x_1)^2\,[f(x_2) - f(x_3)] - (x_2 - x_3)^2\,[f(x_2) - f(x_1)]}{(x_2 - x_1)\,[f(x_2) - f(x_3)] - (x_2 - x_3)\,[f(x_2) - f(x_1)]} \tag{7.10}$$

where $x_1$, $x_2$, and $x_3$ are the initial guesses, and $x_4$ is the value of x that corresponds to the
optimum value of the parabolic fit to the guesses.
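Equation (7.10) translates directly into a one-line anonymous function. The sketch below is illustrative only: the name parab is hypothetical, and the test function and guesses are those of this section's worked examples ($f(x) = x^2/10 - 2\sin x$ with guesses 0, 1, and 4).

```matlab
% Eq. (7.10) as a function of three points and their function values.
parab = @(x1,x2,x3,f1,f2,f3) x2 - 0.5* ...
    ((x2-x1)^2*(f2-f3) - (x2-x3)^2*(f2-f1)) / ...
    ((x2-x1)*(f2-f3)   - (x2-x3)*(f2-f1));

f = @(x) x.^2/10 - 2*sin(x);
x1 = 0; x2 = 1; x3 = 4;
x4 = parab(x1, x2, x3, f(x1), f(x2), f(x3));
fprintf('x4 = %.4f, f(x4) = %.4f\n', x4, f(x4));
```

A single application moves the estimate from the middle guess of 1 to about 1.5055, already close to the minimum.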


Parabolic Interpolation 


Problem Statement. Use parabolic interpolation to approximate the minimum of

$$f(x) = \frac{x^2}{10} - 2\sin x$$

with initial guesses of $x_1 = 0$, $x_2 = 1$, and $x_3 = 4$.


Solution. The function values at the three guesses can be evaluated: 
$x_1 = 0$    $f(x_1) = 0$
$x_2 = 1$    $f(x_2) = -1.5829$
$x_3 = 4$    $f(x_3) = 3.1136$

and substituted into Eq. (7.10) to give

$$x_4 = 1 - \frac{1}{2}\,\frac{(1 - 0)^2\,[-1.5829 - 3.1136] - (1 - 4)^2\,[-1.5829 - 0]}{(1 - 0)\,[-1.5829 - 3.1136] - (1 - 4)\,[-1.5829 - 0]} = 1.5055$$


which has a function value of f(1.5055) = —1.7691. 

Next, a strategy similar to the golden-section search can be employed to determine
which point should be discarded. Because the function value for the new point is lower
than for the intermediate point ($x_2$) and the new x value is to the right of the intermediate
point, the lower guess ($x_1$) is discarded. Therefore, for the next iteration:


$x_1 = 1$         $f(x_1) = -1.5829$
$x_2 = 1.5055$    $f(x_2) = -1.7691$
$x_3 = 4$         $f(x_3) = 3.1136$


which can be substituted into Eq. (7.10) to give

$$x_4 = 1.5055 - \frac{1}{2}\,\frac{(1.5055 - 1)^2\,[-1.7691 - 3.1136] - (1.5055 - 4)^2\,[-1.7691 - (-1.5829)]}{(1.5055 - 1)\,[-1.7691 - 3.1136] - (1.5055 - 4)\,[-1.7691 - (-1.5829)]} = 1.4903$$


which has a function value of f(1.4903) = —1.7714. The process can be repeated, with the 
results tabulated here: 


i   x_1      f(x_1)    x_2      f(x_2)    x_3      f(x_3)    x_4      f(x_4)
1   0.0000    0.0000   1.0000   -1.5829   4.0000    3.1136   1.5055   -1.7691
2   1.0000   -1.5829   1.5055   -1.7691   4.0000    3.1136   1.4903   -1.7714
3   1.0000   -1.5829   1.4903   -1.7714   1.5055   -1.7691   1.4256   -1.7757
4   1.0000   -1.5829   1.4256   -1.7757   1.4903   -1.7714   1.4266   -1.7757
5   1.4256   -1.7757   1.4266   -1.7757   1.4903   -1.7714   1.4275   -1.7757


Thus, within five iterations, the result is converging rapidly on the true value of —1.7757 
at x = 1.4276. 
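The iterations in the table can be sketched as a loop. The point-discarding rule coded below is one reading of the strategy described in the example (keep the three points that still bracket the lowest value found so far); it is an assumption, not a definitive implementation, and it includes no safeguards against a zero denominator.

```matlab
f = @(x) x.^2/10 - 2*sin(x);
x  = [0 1 4];                    % x(1) < x(2) < x(3) bracket the minimum
fx = f(x);

for i = 1:5
    num = (x(2)-x(1))^2*(fx(2)-fx(3)) - (x(2)-x(3))^2*(fx(2)-fx(1));
    den = (x(2)-x(1))*(fx(2)-fx(3))   - (x(2)-x(3))*(fx(2)-fx(1));
    x4 = x(2) - 0.5*num/den;     % Eq. (7.10)
    f4 = f(x4);
    fprintf('%d  x4 = %.4f  f(x4) = %.4f\n', i, x4, f4);
    if x4 > x(2)
        if f4 < fx(2)            % new point better, to the right: drop x(1)
            x = [x(2) x4 x(3)];   fx = [fx(2) f4 fx(3)];
        else                     % middle point still best: drop x(3)
            x = [x(1) x(2) x4];   fx = [fx(1) fx(2) f4];
        end
    else
        if f4 < fx(2)            % new point better, to the left: drop x(3)
            x = [x(1) x4 x(2)];   fx = [fx(1) f4 fx(2)];
        else                     % middle point still best: drop x(1)
            x = [x4 x(2) x(3)];   fx = [f4 fx(2) fx(3)];
        end
    end
end
```

The three retained points are always kept in ascending order so that Eq. (7.10) can be applied unchanged on the next pass.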


7.2.3 MATLAB Function: fminbnd 


Recall that in Sec. 6.4 we described Brent’s method for root location, which combined sev- 
eral root-finding methods into a single algorithm that balanced reliability with efficiency. 
Because of these qualities, it forms the basis for the built-in MATLAB function fzero. 

Brent also developed a similar approach for one-dimensional minimization which 
forms the basis for the MATLAB fminbnd function. It combines the slow, dependable 
golden-section search with the faster, but possibly unreliable, parabolic interpolation. It 
first attempts parabolic interpolation and keeps applying it as long as acceptable results are 
obtained. If not, it uses the golden-section search to get matters in hand. 


A simple expression of its syntax is 
[xmin, fval] = fminbnd(function,x1,x2)


where xmin and fval are the location and value of the minimum, function is the name
of the function being evaluated, and x1 and x2 are the bounds of the interval being
searched.

Here is a simple MATLAB session that uses fminbnd to solve the problem from 
Example 7.1. 

>> g=9.81;v0=55;m=80;c=15;z0=100;
>> z=@(t) -(z0+m/c*(v0+m*g/c)*(1-exp(-c/m*t))-m*g/c*t);
>> [x,f]=fminbnd(z,0,8)

x =
    3.8317
f =
 -192.8609


As with fzero, optional parameters can be specified using optimset. For example, we 
can display calculation details: 


>> options = optimset('display','iter'); 
>> fminbnd(z,0,8,options)


 Func-count     x          f(x)         Procedure
    1        3.05573     -189.759        initial
    2        4.94427     -187.19         golden
    3        1.88854     -171.871        golden
    4        3.87544     -192.851        parabolic
    5        3.85836     -192.857        parabolic
    6        3.83332     -192.861        parabolic
    7        3.83162     -192.861        parabolic
    8        3.83166     -192.861        parabolic
    9        3.83169     -192.861        parabolic


Optimization terminated:
 the current x satisfies the termination criteria using
 OPTIONS.TolX of 1.000000e-004


ans = 
3.8317 


Thus, after three iterations, the method switches from golden to parabolic, and after eight 
iterations, the minimum is determined to a tolerance of 0.0001. 
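When only the totals are of interest, the optional third and fourth outputs of fminbnd report them without printing every step. The sketch below applies this to the function used in this section's worked examples; that choice of test function is an assumption made for illustration.

```matlab
f = @(x) x.^2/10 - 2*sin(x);

% exitflag = 1 signals convergence; output holds the iteration statistics.
[xmin, fval, exitflag, output] = fminbnd(f, 0, 4);

fprintf('xmin = %.4f, fval = %.4f\n', xmin, fval);
fprintf('iterations = %d, function evaluations = %d\n', ...
        output.iterations, output.funcCount);
```

Comparing output.funcCount with the evaluation count of a pure golden-section search gives a feel for what the parabolic steps save.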


7.3 MULTIDIMENSIONAL OPTIMIZATION


Aside from one-dimensional functions, optimization also deals with multidimensional 
functions. Recall from Fig. 7.3a that our visual image of a one-dimensional search was 
like a roller coaster. For two-dimensional cases, the image becomes that of mountains and 
valleys (Fig. 7.3b). As in the following example, MATLAB’s graphic capabilities provide 
a handy means to visualize such functions. 
EXAMPLE 7.4


Visualizing a Two-Dimensional Function 


Problem Statement. Use MATLAB’s graphical capabilities to display the following
function and visually estimate its minimum in the range $-2 \le x_1 \le 0$ and $0 \le x_2 \le 3$:

$$f(x_1, x_2) = 2 + x_1 - x_2 + 2x_1^2 + 2x_1x_2 + x_2^2$$


Solution. The following script generates contour and mesh plots of the function: 


x=linspace(-2,0,40);y=linspace(0,3,40);
[X,Y] = meshgrid(x,y);
Z=2+X-Y+2*X.^2+2*X.*Y+Y.^2;
subplot(1,2,1);
cs=contour(X,Y,Z);clabel(cs);
xlabel('x_1');ylabel('x_2');
title('(a) Contour plot');grid;
subplot(1,2,2);
cs=surfc(X,Y,Z);
zmin=floor(min(Z));
zmax=ceil(max(Z));
xlabel('x_1');ylabel('x_2');zlabel('f(x_1,x_2)');
title('(b) Mesh plot');


As displayed in Fig. 7.9, both plots indicate that the function has a minimum value of about
$f(x_1, x_2) = 0$ to 1 located at about $x_1 = -1$ and $x_2 = 1.5$.
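The visual estimate can be backed up by searching the same grid for its smallest sampled value. This is only a coarse sketch: the result is limited by the 40-point grid spacing and is not a substitute for an optimizer.

```matlab
x = linspace(-2,0,40); y = linspace(0,3,40);
[X,Y] = meshgrid(x,y);
Z = 2+X-Y+2*X.^2+2*X.*Y+Y.^2;

[zmin, idx] = min(Z(:));   % smallest sampled value and its linear index
fprintf('grid minimum f = %.4f at x1 = %.4f, x2 = %.4f\n', ...
        zmin, X(idx), Y(idx));
```

The grid point returned lies within one grid spacing of the location read from the plots.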


FIGURE 7.9 
(a) Contour and (b) mesh plots of a two-dimensional function. 


Techniques for multidimensional unconstrained optimization can be classified in a
number of ways. For purposes of the present discussion, we will divide them depending 
on whether they require derivative evaluation. Those that require derivatives are called gra- 
dient, or descent (or ascent), methods. The approaches that do not require derivative evalu- 
ation are called nongradient, or direct, methods. As described next, the built-in MATLAB 
function fminsearch is a direct method. 


7.3.1 MATLAB Function: fminsearch 


Standard MATLAB has a function fminsearch that can be used to determine the mini- 
mum of a multidimensional function. It is based on the Nelder-Mead method, which is 
a direct-search method that uses only function values (does not require derivatives) and 
handles non-smooth objective functions. A simple expression of its syntax is 


[xmin, fval] = fminsearch (function, x0) 


where xmin and fval are the location and value of the minimum, function is the name of the
function being evaluated, and x0 is the initial guess. Note that x0 can be a scalar, vector, or a matrix.

Here is a simple MATLAB session that uses fminsearch to determine the minimum of the
function we just graphed in Example 7.4:

>> f=@(x) 2+x(1)-x(2)+2*x(1)^2+2*x(1)*x(2)+x(2)^2;
>> [x,fval]=fminsearch(f,[-0.5,0.5])

x =
   -1.0000    1.5000
fval =
    0.7500
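Because this particular function is quadratic, the fminsearch result can be verified analytically. Setting both partial derivatives to zero gives the linear system $4x_1 + 2x_2 = -1$ and $2x_1 + 2x_2 = 1$, which the sketch below solves with the backslash operator. This check is an addition for illustration and works only because the gradient happens to be linear.

```matlab
% Gradient of f set to zero: [4 2; 2 2]*[x1; x2] = [-1; 1]
A = [4 2; 2 2];            % also the (constant) Hessian of f
b = [-1; 1];
xstar = A\b;

f = @(x) 2+x(1)-x(2)+2*x(1)^2+2*x(1)*x(2)+x(2)^2;
fprintf('x1 = %.4f, x2 = %.4f, f = %.4f\n', xstar(1), xstar(2), f(xstar));

% Both eigenvalues positive confirms the stationary point is a minimum.
disp(eig(A).');
```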


7.4 CASE STUDY   EQUILIBRIUM AND MINIMUM POTENTIAL ENERGY


Background. As in Fig. 7.10a, an unloaded spring can be attached to a wall mount. When
a horizontal force is applied, the spring stretches. The displacement is related to the force
by Hooke’s law, F = kx. The potential energy of the deformed state consists of the difference
between the strain energy of the spring and the work done by the force:

$$PE(x) = 0.5kx^2 - Fx \tag{7.11}$$


FIGURE 7.10 
(a) An unloaded spring attached to a wall mount. (b) Application of a horizontal force stretches
the spring where the relationship between force and displacement is described by Hooke’s law.


Equation (7.11) defines a parabola. Since the potential energy will be at a minimum
at equilibrium, the solution for displacement can be viewed as a one-dimensional
optimization problem. Because this equation is so easy to differentiate, we can solve for the
displacement as x = F/k. For example, if k = 2 N/cm and F = 5 N, x = 5 N/(2 N/cm) =
2.5 cm.

A more interesting two-dimensional case is shown in Fig. 7.11. In this system, there
are two degrees of freedom in that the system can move both horizontally and vertically.
In the same way that we approached the one-dimensional system, the equilibrium
deformations are the values of $x_1$ and $x_2$ that minimize the potential energy:

$$PE(x_1, x_2) = 0.5k_a\left(\sqrt{x_1^2 + (L_a - x_2)^2} - L_a\right)^2 + 0.5k_b\left(\sqrt{x_1^2 + (L_b + x_2)^2} - L_b\right)^2 - F_1x_1 - F_2x_2 \tag{7.12}$$

If the parameters are $k_a$ = 9 N/cm, $k_b$ = 2 N/cm, $L_a$ = 10 cm, $L_b$ = 10 cm, $F_1$ = 2 N, and
$F_2$ = 4 N, use MATLAB to solve for the displacements and the potential energy.


FIGURE 7.11 
A two-spring system: (a) unloaded and (b) loaded. 


Solution. An M-file can be developed to hold the potential energy function:

function p=PE(x,ka,kb,La,Lb,F1,F2)
PEa=0.5*ka*(sqrt(x(1)^2+(La-x(2))^2)-La)^2;
PEb=0.5*kb*(sqrt(x(1)^2+(Lb+x(2))^2)-Lb)^2;
W=F1*x(1)+F2*x(2);
p=PEa+PEb-W;


The solution can be obtained with the fminsearch function: 


>> ka=9;kb=2;La=10;Lb=10;F1=2;F2=4;
>> [x,f]=fminsearch(@(x) PE(x,ka,kb,La,Lb,F1,F2),[-0.5,0.5])

x =
    4.9523    1.2769
f =
   -9.6422


Thus, at equilibrium, the potential energy is −9.6422 N·cm. The connecting point is
located 4.9523 cm to the right and 1.2769 cm above its original position.


PROBLEMS 


7.1 Perform three iterations of the Newton-Raphson method 
to determine the root of Eq. (E7.1.1). Use the parameter val- 
ues from Example 7.1 along with an initial guess of t = 3 s. 
7.2 Given the formula

$$f(x) = -x^2 + 8x - 12$$

(a) Determine the maximum and the corresponding value of
x for this function analytically (i.e., using differentiation).
(b) Verify that Eq. (7.10) yields the same results based on
initial guesses of $x_1 = 0$, $x_2 = 2$, and $x_3 = 6$.

7.3 Consider the following function:

$$f(x) = 3 + 6x + 5x^2 + 3x^3 + 4x^4$$

Locate the minimum by finding the root of the derivative of
this function. Use bisection with initial guesses of $x_l = -2$
and $x_u = 1$.
7.4 Given 


$$f(x) = -1.5x^6 - 2x^4 + 12x$$


(a) Plot the function. 
(b) Use analytical methods to prove that the function is con- 
cave for all values of x. 


(c) Differentiate the function and then use a root-location 
method to solve for the maximum f(x) and the corre- 
sponding value of x. 

7.5 Solve for the value of x that maximizes f(x) in Prob. 7.4
using the golden-section search. Employ initial guesses of
$x_l = 0$ and $x_u = 2$, and perform three iterations.

7.6 Repeat Prob. 7.5, except use parabolic interpolation.
Employ initial guesses of $x_1 = 0$, $x_2 = 1$, and $x_3 = 2$, and
perform three iterations.

7.7 Employ the following methods to find the maximum of

$$f(x) = 4x - 1.8x^2 + 1.2x^3 - 0.3x^4$$

(a) Golden-section search ($x_l = -2$, $x_u = 4$, $\varepsilon_s = 1\%$).
(b) Parabolic interpolation ($x_1 = 1.75$, $x_2 = 2$, $x_3 = 2.5$,
iterations = 5).

7.8 Consider the following function:

$$f(x) = x^4 + 2x^3 + 8x^2 + 5x$$

Use analytical and graphical methods to show the function
has a minimum for some value of x in the range
$-2 \le x \le 1$.
7.9 Employ the following methods to find the minimum of
the function from Prob. 7.8:
(a) Golden-section search ($x_l = -2$, $x_u = 1$, $\varepsilon_s = 1\%$).
(b) Parabolic interpolation ($x_1 = -2$, $x_2 = -1$, $x_3 = 1$,
iterations = 5).
7.10 Consider the following function:

$$f(x) = 2x + \frac{3}{x}$$

Perform 10 iterations of parabolic interpolation to locate
the minimum. Comment on the convergence of your results
($x_1 = 0.1$, $x_2 = 0.5$, $x_3 = 5$).
7.11 The following function defines a curve with several
unequal minima over the interval $2 \le x \le 20$:

$$f(x) = \sin(x) + \sin\left(\frac{2}{3}x\right)$$

Develop a MATLAB script to (a) plot the function over
the interval. Determine the minimum (b) with fminbnd and
(c) by hand with golden-section search with a stopping
criterion corresponding to three significant figures. For (b) and
(c), use initial guesses of [4, 8].

7.12 Use the golden-section search to determine the location,
$x_{\max}$, and maximum, $f(x_{\max})$, of the following function
by hand:

$$f(x) = -0.8x^4 + 2.2x^2 + 0.6$$

Use initial guesses of $x_l = 0.7$ and $x_u = 1.4$ and perform sufficient
iterations so that $\varepsilon_s = 10\%$. Determine the approximate
relative error of your final result.

7.13 Develop a single script to (a) generate contour and 
mesh subplots of the following temperature field in a similar 
fashion to Example 7.4: 


$$T(x, y) = 2x^2 + 3y^2 - 4xy - y - 3x$$


and (b) determine the minimum with fminsearch. 
7.14 The head of a groundwater aquifer is described in Cartesian
coordinates by

$$h(x, y) = \frac{1}{1 + x^2 + y^2 + x + xy}$$


Develop a single script to (a) generate contour and mesh 
subplots of the function in a similar fashion to Example 7.4, 
and (b) determine the maximum with fminsearch. 

7.15 Recent interest in competitive and recreational cycling 
has meant that engineers have directed their skills toward 
the design and testing of mountain bikes (Fig. P7.15a). Sup- 
pose that you are given the task of predicting the horizontal 


FIGURE P7.15 
(a) A mountain bike along with (b) a free-body diagram 
for a part of the frame. 


and vertical displacement of a bike bracketing system in 
response to a force. Assume the forces you must analyze 
can be simplified as depicted in Fig. P7.15b. You are inter- 
ested in testing the response of the truss to a force exerted 
in any number of directions designated by the angle θ. The
parameters for the problem are E = Young’s modulus =
2 × 10^11 Pa, A = cross-sectional area = 0.0001 m², w = width =
0.44 m, ℓ = length = 0.56 m, and h = height = 0.5 m. The
displacements x and y can be solved by determining the values
that yield a minimum potential energy. Determine the
displacements for a force of 10,000 N and a range of θ’s
from 0° (horizontal) to 90° (vertical).

7.16 As electric current moves through a wire (Fig. P7.16), 
heat generated by resistance is conducted through a layer of 
insulation and then convected to the surrounding air. The 
steady-state temperature of the wire can be computed as 


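$$T = T_{\text{air}} + \frac{q}{2\pi}\left[\frac{1}{k}\ln\left(\frac{r_w + r_i}{r_w}\right) + \frac{1}{h}\,\frac{1}{r_w + r_i}\right]$$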

FIGURE P7.16 
Cross-section of an insulated wire. 
Determine the thickness of insulation $r_i$ (m) that minimizes
the wire’s temperature given the following parameters: q =
heat generation rate = 75 W/m, $r_w$ = wire radius = 6 mm,
k = thermal conductivity of insulation = 0.17 W/(m K),
h = convective heat transfer coefficient = 12 W/(m² K), and
$T_{\text{air}}$ = air temperature = 293 K.

7.17 Develop an M-file that is expressly designed to locate 
a maximum with the golden-section search. In other words, 
set it up so that it directly finds the maximum rather than 
finding the minimum of —f(x). The function should have the 
following features: 


• Iterate until the relative error falls below a stopping criterion
or exceeds a maximum number of iterations.
• Return both the optimal x and f(x).


Test your program with the same problem as Example 7.1. 
7.18 Develop an M-file to locate a minimum with the 
golden-section search. Rather than using the maximum it- 
erations and Eq. (7.9) as the stopping criteria, determine the 
number of iterations needed to attain a desired tolerance. 
Test your function by solving Example 7.2 using $E_{a,d} = 0.0001$.

7.19 Develop an M-file to implement parabolic interpo- 
lation to locate a minimum. The function should have the 
following features: 


• Base it on two initial guesses, and have the program
generate the third initial value at the midpoint of the
interval.
• Check whether the guesses bracket a maximum. If not,
the function should not implement the algorithm, but
should return an error message.
• Iterate until the relative error falls below a stopping criterion
or exceeds a maximum number of iterations.
• Return both the optimal x and f(x).


Test your program with the same problem as Example 7.3. 
7.20 Pressure measurements are taken at certain points 
behind an airfoil over time. These data best fit the curve 
y = 6 cos x − 1.5 sin x from x = 0 to 6 s. Use four iterations
of the golden-section search to find the minimum pressure.
Set $x_l = 2$ and $x_u = 4$.

7.21 The trajectory of a ball can be computed with

$$y = (\tan\theta_0)\,x - \frac{g}{2v_0^2\cos^2\theta_0}\,x^2 + y_0$$

where y = the height (m), $\theta_0$ = the initial angle (radians),
$v_0$ = the initial velocity (m/s), g = the gravitational
constant = 9.81 m/s², and $y_0$ = the initial height (m). Use
the golden-section search to determine the maximum height
given $y_0$ = 2 m, $v_0$ = 20 m/s, and $\theta_0$ = 45°. Iterate until the
approximate error falls below $\varepsilon_s = 10\%$ using initial guesses
of $x_l = 10$ and $x_u = 30$ m.

7.22 The deflection of a uniform beam subject to a linearly
increasing distributed load can be computed as

$$y = \frac{w_0}{120EIL}\left(-x^5 + 2L^2x^3 - L^4x\right)$$

Given that L = 600 cm, E = 50,000 kN/cm², I = 30,000 cm⁴,
and $w_0$ = 2.5 kN/cm, determine the point of maximum deflection
(a) graphically, (b) using the golden-section search
until the approximate error falls below $\varepsilon_s = 1\%$ with initial
guesses of $x_l = 0$ and $x_u = L$.

7.23 An object with a mass of 90 kg is projected upward
from the surface of the earth at a velocity of 60 m/s. If the
object is subject to linear drag (c = 15 kg/s), use the golden-
section search to determine the maximum height the object
attains.

7.24 The normal distribution is a bell-shaped curve defined by

$$y = e^{-x^2}$$

Use the golden-section search to determine the location of
the inflection point of this curve for positive x.
7.25 Use the fminsearch function to determine the 
minimum of 


$$f(x, y) = 2y^2 - 2.25xy - 1.75y + 1.5x^2$$


7.26 Use the fminsearch function to determine the 
maximum of 


$$f(x, y) = 4x + 2y + x^2 - 2x^4 + 2xy - 3y^2$$

7.27 Given the following function:

$$f(x, y) = -8x + x^2 + 12y + 4y^2 - 2xy$$

Determine the minimum (a) graphically, (b) numerically with the fminsearch function, and (c) substitute the result of (b) back into the function to determine the minimum f(x, y).

7.28 The specific growth rate of a yeast that produces an 
antibiotic is a function of the food concentration c: 


$$g = \frac{2c}{4 + 0.8c + c^2 + 0.2c^3}$$

As depicted in Fig. P7.28, growth goes to zero at very low concentrations due to food limitation. It also goes to zero at high concentrations due to toxicity effects. Find the value of c at which growth is a maximum.

FIGURE P7.28
The specific growth rate of a yeast that produces an antibiotic versus the food concentration.


7.29 A compound A will be converted into B in a stirred tank 
reactor. The product B and unreacted A are purified in a sepa- 
ration unit. Unreacted A is recycled to the reactor. A process 
engineer has found that the initial cost of the system is a function of the conversion $x_A$. Find the conversion that will result in the lowest cost system. C is a proportionality constant.

$$\text{Cost} = C\left[\left(\frac{1}{(1 - x_A)^2}\right)^{0.6} + 6\left(\frac{1}{x_A}\right)^{0.6}\right]$$

7.30 A finite-element model of a cantilever beam subject to loading and moments (Fig. P7.30) is given by optimizing

$$f(x, y) = 5x^2 - 5xy + 2.5y^2 - x - 1.5y$$


where x = end displacement and y = end moment. Find the 
values of x and y that minimize f(x, y). 

7.31 The Streeter-Phelps model can be used to compute the 
dissolved oxygen concentration in a river below a point dis- 
charge of sewage (Fig. P7.31), 


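$$o = o_s - \frac{k_d L_o}{k_d + k_s - k_a}\left(e^{-k_a t} - e^{-(k_d + k_s)t}\right) - \frac{S_b}{k_a}\left(1 - e^{-k_a t}\right) \qquad \text{(P7.31)}$$

FIGURE P7.30
A cantilever beam.

FIGURE P7.31
A dissolved oxygen “sag” below a point discharge of sewage into a river.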


where o = dissolved oxygen concentration (mg/L), $o_s$ = oxygen saturation concentration (mg/L), t = travel time (d), $L_o$ = biochemical oxygen demand (BOD) concentration at the mixing point (mg/L), $k_d$ = rate of decomposition of BOD (d⁻¹), $k_s$ = rate of settling of BOD (d⁻¹), $k_a$ = reaeration rate (d⁻¹), and $S_b$ = sediment oxygen demand (mg/L/d).

As indicated in Fig. P7.31, Eq. (P7.31) produces an oxygen “sag” that reaches a critical minimum level $o_c$, some travel time $t_c$ below the point discharge. This point is called “critical” because it represents the location where biota that depend on oxygen (like fish) would be the most stressed. Develop a MATLAB script that (a) generates a plot of the function versus travel time and (b) uses fminbnd to determine the critical travel time and concentration, given the following values:

$o_s$ = 10 mg/L    $k_d$ = 0.1 d⁻¹    $k_a$ = 0.6 d⁻¹
$k_s$ = 0.05 d⁻¹    $L_o$ = 50 mg/L    $S_b$ = 1 mg/L/d


7.32 The two-dimensional distribution of pollutant concen- 
tration in a channel can be described by 


$$c(x, y) = 7.9 + 0.13x + 0.21y - 0.05x^2 - 0.016y^2 - 0.007xy$$

Determine the exact location of the peak concentration given the function and the knowledge that the peak lies within the bounds −10 ≤ x ≤ 10 and 0 ≤ y ≤ 20.

7.33 A total charge Q is uniformly distributed around a ring- 
shaped conductor with radius a. A charge q is located at a 
distance x from the center of the ring (Fig. P7.33). The force 
exerted on the charge by the ring is given by 


$$F = \frac{1}{4\pi e_0}\,\frac{qQx}{(x^2 + a^2)^{3/2}}$$

where $e_0 = 8.85 \times 10^{-12}$ C²/(N m²), q = Q = $2 \times 10^{-5}$ C, and a = 0.9 m. Determine the distance x where the force is a maximum.

7.34 The torque transmitted to an induction motor is a func- 
tion of the slip between the rotation of the stator field and the 
rotor speed s, where slip is defined as 


$$s = \frac{n - n_R}{n}$$

where n = revolutions per second of rotating stator speed and $n_R$ = rotor speed. Kirchhoff’s laws can be used to show that the torque (expressed in dimensionless form) and slip are related by

$$T = \frac{15s(1 - s)}{(1 - s)(4s^2 - 3s + 4)}$$


Figure P7.34 shows this function. Use a numerical method 
to determine the slip at which the maximum torque occurs. 
7.35 The total drag on an airfoil can be estimated by 


$$D = \underbrace{0.01\sigma V^2}_{\text{Friction}} + \underbrace{\frac{0.95}{\sigma}\left(\frac{W}{V}\right)^2}_{\text{Lift}}$$

where D = drag, σ = ratio of air density between the flight altitude and sea level, W = weight, and V = velocity. As seen in Fig. P7.35, the two factors contributing to drag are affected differently as velocity increases. Whereas friction drag increases with velocity, the drag due to lift decreases. The combination of the two factors leads to a minimum drag.

(a) If σ = 0.6 and W = 16,000, determine the minimum drag and the velocity at which it occurs.

(b) In addition, develop a sensitivity analysis to determine how this optimum varies in response to a range of W = 12,000 to 20,000 with σ = 0.6.

FIGURE P7.34
Torque transmitted to an inductor as a function of slip.

FIGURE P7.35
Plot of drag versus velocity for an airfoil.

7.36 Roller bearings are subject to fatigue failure caused by 

large contact loads F (Fig. P7.36). The problem of finding 

the location of the maximum stress along the x axis can be 

shown to be equivalent to maximizing the function: 


$$f(x) = \frac{0.4}{\sqrt{1 + x^2}} - \sqrt{1 + x^2}\left(1 - \frac{0.4}{1 + x^2}\right) + x$$

Find the x that maximizes f(x).

FIGURE P7.36
Roller bearings.
FIGURE P7.37
Two frictionless masses connected to a wall by a pair of linear elastic springs.

7.37 In a similar fashion to the case study described in Sec. 7.4, develop the potential energy function for the system depicted in Fig. P7.37. Develop contour and surface plots in MATLAB. Minimize the potential energy function to determine the equilibrium displacements $x_1$ and $x_2$ given the forcing function F = 100 N and the parameters $k_a$ = 20 and $k_b$ = 15 N/m.

7.38 As an agricultural engineer, you must design a trap- 
ezoidal open channel to carry irrigation water (Fig. P7.38). 
Determine the optimal dimensions to minimize the wetted 
perimeter for a cross-sectional area of 50 m². Are the relative
dimensions universal? 

7.39 Use the function fminsearch to determine the length 
of the shortest ladder that reaches from the ground over the 
fence to the building’s wall (Fig. P7.39). Test it for the case 
where h = d = 4 m. 

7.40 The length of the longest ladder that can negoti- 
ate the corner depicted in Fig. P7.40 can be determined 
by computing the value of θ that minimizes the following function:

$$L(\theta) = \frac{w_1}{\sin\theta} + \frac{w_2}{\sin(\pi - \alpha - \theta)}$$

FIGURE P7.38

FIGURE P7.39
A ladder leaning against a fence and just touching a wall.

For the case where $w_1$ = $w_2$ = 2 m, use a numerical method described in this chapter (including MATLAB’s built-in capabilities) to develop a plot of L versus a range of α’s from 45 to 135°.

7.41 Figure P7.41 shows a pinned-fixed beam subject to a 
uniform load. The equation for the resulting deflections is 


$$y = -\frac{w}{48EI}\left(2x^4 - 3Lx^3 + L^3x\right)$$

FIGURE P7.40
A ladder negotiating a corner formed by two hallways.

FIGURE P7.41

Develop a MATLAB script that uses fminbnd to (a) generate a labeled plot of deflection versus distance and (b) determine the location and magnitude of the maximum deflection. Employ initial guesses of 0 and L and use optimset to display the iterations. Use the following parameter values in your computation (making sure that you use consistent units): L = 400 cm, E = 52,000 kN/cm², I = 32,000 cm⁴, and w = 4 kN/cm.

7.42 For a jet in steady, level flight, thrust balances drag 
and lift balances weight (Fig. P7.42). Under these condi- 
tions, the optimal cruise speed occurs when the ratio of drag 
force to velocity is minimized. The drag, $C_D$, can be computed as

$$C_D = C_{D0} + \frac{C_L^2}{\pi \cdot AR}$$

where $C_{D0}$ = drag coefficient at zero lift, $C_L$ = the lift coefficient, and AR = the aspect ratio. For steady level flight, the lift coefficient can be computed as

$$C_L = \frac{2W}{\rho v^2 A}$$

where W = the jet’s weight (N), ρ = air density (kg/m³), v = velocity (m/s), and A = wing planform area (m²). The drag force can then be computed as

$$F_D = W\,\frac{C_D}{C_L}$$

FIGURE P7.42
The four major forces on a jet in steady, level flight.

Use these formulas to determine the optimal steady cruise velocity for a 670 kN jet flying at 10 km above sea level. Employ the following parameters in your computation: A = 150 m², AR = 6.5, $C_{D0}$ = 0.018, and ρ = 0.413 kg/m³.

7.43 Develop a MATLAB script to generate a plot of the 
optimal velocity of the jet from Prob. 7.42 versus elevation 
above sea level. Employ a mass of 68,300 kg for the jet. 
Note that the gravitational acceleration at 45° latitude can be 
computed as a function of elevation with 


$$g(h) = 9.8066\left(\frac{r_e}{r_e + h}\right)^2$$

where g(h) = gravitational acceleration (m/s²) at elevation h (m) above sea level, and $r_e$ = Earth’s mean radius (= 6.371 × 10⁶ m). In addition, air density as a function of elevation can be calculated with

$$\rho(h) = -9.57926 \times 10^{-14}h^3 + 4.71260 \times 10^{-9}h^2 - 1.18951 \times 10^{-4}h + 1.22534$$


Employ the other parameters from Prob. 7.42 and design the 
plot for elevations ranging from h = 0 to 12 km above sea 
level. 

7.44 As depicted in Fig. P7.44, a mobile fire hose projects a 
stream of water onto the roof of a building. At what angle, θ, and how far from the building, $x_1$, should the hose be placed in order to maximize the coverage of the roof; that is, to maximize $x_2 - x_1$? Note that the water velocity leaving the nozzle has a constant value of 3 m/s regardless of the angle, and the other parameter values are $h_1$ = 0.06 m, $h_2$ = 0.2 m, and L = 0.12 m. [Hint: The coverage is maximized for the trajectory that just clears the top front corner. That is, we want to choose an $x_1$ and θ that just clear the top corner while maximizing $x_2 - x_1$.]

FIGURE P7.44

7.45 Since many pollutants enter lakes (and other waterbodies for that matter) at their peripheries, an important
water-quality problem involves modeling the distribution 
of contaminants in the vicinity of a waste discharge or a 
river. For a vertically well-mixed, constant-depth layer, the 
steady-state distribution of a pollutant reacting with first- 
order decay is represented by 


$$0 = -U_x\frac{\partial c}{\partial x} + E\frac{\partial^2 c}{\partial x^2} + E\frac{\partial^2 c}{\partial y^2} - kc$$

FIGURE P7.45
Plan view of a section of a lake with a point source of pollutant entering at the middle of the lower boundary.


where the x and y axes are defined to be parallel and per- 
pendicular to the shoreline, respectively (Fig. P7.45). The 
parameters and variables are: $U_x$ = the water velocity along
the shoreline (m/d), c = concentration, E = the turbulent 
diffusion coefficient, and k = the first-order decay rate. For 
the case where a constant loading, W, enters at (0, 0), the 
solution for the concentration at any coordinate is given by 


c=2 ec y)+ È [e(x, y + 2nY) - c(x, y + 2nY)] 


n=1 


where 


Uxx UX 


where Y = the width, H = depth, and K, = the modified Bes- 
sel function of the second kind. Develop a MATLAB script 
to generate a contour plot of concentration for a section of a 
lake with Y = 4.8 km and a length from X = -2.4 to 2.4 km 
using Ax = Ay = 0.32 km. Employ the following parameters 
for your calculation: W = 1.2 x 10°, H = 20, E = 5 x 10°, 
U,=5x10°,k=1, andn=3. 
What Are Linear Algebraic Equations?

In Part Two, we determined the value x that satisfied a single equation, f(x) = 0. Now, we deal with the case of determining the values $x_1, x_2, \ldots, x_n$ that simultaneously satisfy a set of equations:

$$\begin{aligned}
f_1(x_1, x_2, \ldots, x_n) &= 0 \\
f_2(x_1, x_2, \ldots, x_n) &= 0 \\
&\;\;\vdots \\
f_n(x_1, x_2, \ldots, x_n) &= 0
\end{aligned}$$

Such systems are either linear or nonlinear. In Part Three, we deal with linear algebraic equations that are of the general form

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned} \qquad \text{(PT3.1)}$$


where the a’s are constant coefficients, the b’s 
are constants, the x’s are unknowns, and n is the 
number of equations. All other algebraic equa- 
tions are nonlinear. 


Linear Algebraic Equations in 
Engineering and Science 


Many of the fundamental equations of engineering and science are based on conservation laws. Some familiar quantities that conform to such laws are mass, energy, and momentum. In mathematical terms, these principles lead to balance or continuity equations that relate system behavior as represented by the levels or response of the quantity being modeled to the properties or characteristics of the system and the external stimuli or forcing functions acting on the system.

FIGURE PT3.1
Two types of systems that can be modeled using linear algebraic equations: (a) lumped variable system that involves coupled finite components and (b) distributed variable system that involves a continuum.

As an example, the principle of mass conservation can be used to formulate a model 
for a series of chemical reactors (Fig. PT3.1a). For this case, the quantity being modeled is 
the mass of the chemical in each reactor. The system properties are the reaction character- 
istics of the chemical and the reactors’ sizes and flow rates. The forcing functions are the 
feed rates of the chemical into the system. 

When we studied roots of equations, you saw how single-component systems result in a 
single equation that can be solved using root-location techniques. Multicomponent systems 
result in a coupled set of mathematical equations that must be solved simultaneously. The 
equations are coupled because the individual parts of the system are influenced by other 
parts. For example, in Fig. PT3.1a, reactor 4 receives chemical inputs from reactors 2 and 3. 
Consequently, its response is dependent on the quantity of chemical in these other reactors. 

When these dependencies are expressed mathematically, the resulting equations are often of the linear algebraic form of Eq. (PT3.1). The x’s are usually measures of the magnitudes of the responses of the individual components. Using Fig. PT3.1a as an example, $x_1$ might quantify the amount of chemical mass in the first reactor, $x_2$ might quantify the amount in the second, and so forth. The a’s typically represent the properties and characteristics that bear on the interactions between components. For instance, the a’s for Fig. PT3.1a might be reflective of the flow rates of mass between the reactors. Finally, the b’s usually represent the forcing functions acting on the system, such as the feed rate.

Multicomponent problems of these types arise from both lumped (macro-) and distributed (micro-) variable mathematical models. Lumped variable problems involve coupled finite components. The three interconnected bungee jumpers described at the beginning of Chap. 8 are a lumped system. Other examples include trusses, reactors, and electric circuits.

Conversely, distributed variable problems attempt to describe the spatial detail on a 
continuous or semicontinuous basis. The distribution of chemicals along the length of an 
elongated, rectangular reactor (Fig. PT3.1b) is an example of a continuous variable model. 
Differential equations derived from conservation laws specify the distribution of the depen- 
dent variable for such systems. These differential equations can be solved numerically by 
converting them to an equivalent system of simultaneous algebraic equations. 

The solution of such sets of equations represents a major application area for the meth- 
ods in the following chapters. These equations are coupled because the variables at one lo- 
cation are dependent on the variables in adjoining regions. For example, the concentration 
at the middle of the reactor in Fig. PT3.1b is a function of the concentration in adjoining 
regions. Similar examples could be developed for the spatial distribution of temperature, 
momentum, or electricity. 

Aside from physical systems, simultaneous linear algebraic equations also arise in a 
variety of mathematical problem contexts. These result when mathematical functions are 
required to satisfy several conditions simultaneously. Each condition results in an equation 
that contains known coefficients and unknown variables. The techniques discussed in this 
part can be used to solve for the unknowns when the equations are linear and algebraic. 
Some widely used numerical techniques that employ simultaneous equations are regres- 
sion analysis and spline interpolation. 


PART ORGANIZATION 


Due to its importance in formulating and solving linear algebraic equations, Chap. 8 
provides a brief overview of matrix algebra. Aside from covering the rudiments of matrix 
representation and manipulation, the chapter also describes how matrices are handled in 
MATLAB. 

Chapter 9 is devoted to the most fundamental technique for solving linear algebraic 
systems: Gauss elimination. Before launching into a detailed discussion of this technique, a 
preliminary section deals with simple methods for solving small systems. These approaches 
are presented to provide you with visual insight and because one of the methods, the elimination of unknowns, represents the basis for Gauss elimination.

After this preliminary material, “naive” Gauss elimination is discussed. We start with 
this “stripped-down” version because it allows the fundamental technique to be elabo- 
rated on without complicating details. Then, in subsequent sections, we discuss potential 
problems of the naive approach and present a number of modifications to minimize and 
circumvent these problems. The focus of this discussion will be the process of switching 
rows, or partial pivoting. The chapter ends with a brief description of efficient methods for 
solving tridiagonal matrices. 

Chapter 10 illustrates how Gauss elimination can be formulated as an LU factoriza- 
tion. Such solution techniques are valuable for cases where many right-hand-side vectors 
need to be evaluated. The chapter ends with a brief outline of how MATLAB solves linear 
systems. 
Chapter 11 starts with a description of how LU factorization can be employed to efficiently calculate the matrix inverse, which has tremendous utility in analyzing stimulus-response relationships of physical systems. The remainder of the chapter is devoted to the
important concept of matrix condition. The condition number is introduced as a measure of 
the roundoff errors that can result when solving ill-conditioned matrices. 

Chapter 12 deals with iterative solution techniques, which are similar in spirit to the 
approximate methods for roots of equations discussed in Chap. 6. That is, they involve guess- 
ing a solution and then iterating to obtain a refined estimate. The emphasis is on the Gauss- 
Seidel method, although a description is provided of an alternative approach, the Jacobi 
method. The chapter ends with a brief description of how nonlinear simultaneous equations 
can be solved. 

Finally, Chap. 13 is devoted to eigenvalue problems. These have general mathematical 
relevance as well as many applications in engineering and science. We describe two simple 
methods as well as MATLAB’s capabilities for determining eigenvalues and eigenvectors. 
In terms of applications, we focus on their use to study the vibrations and oscillations of 
mechanical systems and structures. 


Linear Algebraic Equations 
and Matrices 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to acquaint you with linear algebraic equations 
and their relationship to matrices and matrix algebra. Specific objectives and topics 
covered are 


• Understanding matrix notation.
• Being able to identify the following types of matrices: identity, diagonal, symmetric, triangular, and tridiagonal.
• Knowing how to perform matrix multiplication and being able to assess when it is feasible.
• Knowing how to represent a system of linear algebraic equations in matrix form.
• Knowing how to solve linear algebraic equations with left division and matrix inversion in MATLAB.


YOU’VE GOT A PROBLEM 


Suppose that three jumpers are connected by bungee cords. Figure 8.1a shows them being held in place vertically so that each cord is fully extended but unstretched. We can define three distances, $x_1$, $x_2$, and $x_3$, as measured downward from each of their unstretched positions. After they are released, gravity takes hold and the jumpers will eventually come to the equilibrium positions shown in Fig. 8.1b.
Suppose that you are asked to compute the displacement of each of the jumpers. If 
we assume that each cord behaves as a linear spring and follows Hooke’s law, free-body 
diagrams can be developed for each jumper as depicted in Fig. 8.2. 
FIGURE 8.1
Three individuals connected by bungee cords: (a) unstretched; (b) stretched.

FIGURE 8.2
Free-body diagrams.


Using Newton’s second law, force balances can be written for each jumper:

$$\begin{aligned}
m_1\frac{d^2x_1}{dt^2} &= m_1g + k_2(x_2 - x_1) - k_1x_1 \\
m_2\frac{d^2x_2}{dt^2} &= m_2g + k_3(x_3 - x_2) + k_2(x_1 - x_2) \\
m_3\frac{d^2x_3}{dt^2} &= m_3g + k_3(x_2 - x_3)
\end{aligned} \qquad (8.1)$$

where $m_i$ = the mass of jumper i (kg), t = time (s), $k_j$ = the spring constant for cord j (N/m), $x_i$ = the displacement of jumper i measured downward from the equilibrium position (m), and g = gravitational acceleration (9.81 m/s²). Because we are interested in the steady-state solution, the second derivatives can be set to zero. Collecting terms gives

$$\begin{aligned}
(k_1 + k_2)x_1 - k_2x_2 &= m_1g \\
-k_2x_1 + (k_2 + k_3)x_2 - k_3x_3 &= m_2g \\
-k_3x_2 + k_3x_3 &= m_3g
\end{aligned} \qquad (8.2)$$

Thus, the problem reduces to solving a system of three simultaneous equations for the three unknown displacements. Because we have used a linear law for the cords, these equations are linear algebraic equations. Chapters 8 through 12 will introduce you to how MATLAB is used to solve such systems of equations.
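As a preview, the steady-state system of Eq. (8.2) can be set up and solved in a few lines. The following is only a minimal sketch: the masses and spring constants are illustrative values assumed here (they are not specified in this section), and the variable names `m`, `k`, `K`, and `b` are likewise chosen for the illustration.

```matlab
% Sketch: steady-state bungee-jumper displacements from Eq. (8.2).
% The parameter values below are assumed for illustration only.
m = [60; 70; 80];      % jumper masses (kg), assumed
k = [50; 100; 50];     % cord spring constants (N/m), assumed
g = 9.81;              % gravitational acceleration (m/s^2)

% Coefficient matrix: each row is one equation of Eq. (8.2)
K = [k(1)+k(2), -k(2),       0;
     -k(2),      k(2)+k(3), -k(3);
      0,        -k(3),       k(3)];

b = m*g;               % right-hand side: the weights m_i*g

x = K\b                % left division solves K*x = b
```

Because the displacements are measured from the unstretched positions, each entry of `x` is the distance that jumper has dropped at equilibrium under the assumed parameters.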
8.1 MATRIX ALGEBRA OVERVIEW


Knowledge of matrices is essential for understanding the solution of linear algebraic equa- 
tions. The following sections outline how matrices provide a concise way to represent and 
manipulate linear algebraic equations. 


8.1.1 Matrix Notation 


A matrix consists of a rectangular array of elements represented by a single symbol. As depicted in Fig. 8.3, [A] is the shorthand notation for the matrix and $a_{ij}$ designates an individual element of the matrix.

A horizontal set of elements is called a row and a vertical set is called a column. The first subscript i always designates the number of the row in which the element lies. The second subscript j designates the column. For example, element $a_{23}$ is in row 2 and column 3.

The matrix in Fig. 8.3 has m rows and n columns and is said to have a dimension of m by n (or m × n). It is referred to as an m by n matrix.

Matrices with row dimension m = 1, such as 


$$[b] = [b_1 \;\; b_2 \;\; \cdots \;\; b_n]$$

are called row vectors. Note that for simplicity, the first subscript of each element is dropped. Also, it should be mentioned that there are times when it is desirable to employ a special shorthand notation to distinguish a row matrix from other types of matrices. One way to accomplish this is to employ special open-topped brackets, as in ⌊b⌋.¹

Matrices with column dimension n = 1, such as 


$$[c] = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{bmatrix} \qquad (8.3)$$

are referred to as column vectors. For simplicity, the second subscript is dropped. As with the row vector, there are occasions when it is desirable to employ a special shorthand notation to distinguish a column matrix from other types of matrices. One way to accomplish this is to employ special brackets, as in {c}.

FIGURE 8.3
A matrix. (The figure labels column 3 and row 2.)

$$[A] = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn}
\end{bmatrix}$$

¹In addition to special brackets, we will use case to distinguish between vectors (lowercase) and matrices (uppercase).
Matrices where m = n are called square matrices. For example, a 3 × 3 matrix is

$$[A] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

The diagonal consisting of the elements $a_{11}$, $a_{22}$, and $a_{33}$ is termed the principal or main diagonal of the matrix.

Square matrices are particularly important when solving sets of simultaneous linear equa- 
tions. For such systems, the number of equations (corresponding to rows) and the number of 
unknowns (corresponding to columns) must be equal for a unique solution to be possible. Con- 
sequently, square matrices of coefficients are encountered when dealing with such systems. 

There are a number of special forms of square matrices that are important and should 
be noted: 

A symmetric matrix is one where the rows equal the columns; that is, $a_{ij} = a_{ji}$ for all i’s and j’s. For example,

$$[A] = \begin{bmatrix} 5 & 1 & 2 \\ 1 & 3 & 7 \\ 2 & 7 & 8 \end{bmatrix}$$

is a 3 × 3 symmetric matrix.

A diagonal matrix is a square matrix where all elements off the main diagonal are equal to zero, as in

$$[A] = \begin{bmatrix} a_{11} & & \\ & a_{22} & \\ & & a_{33} \end{bmatrix}$$

Note that where large blocks of elements are zero, they are left blank.

An identity matrix is a diagonal matrix where all elements on the main diagonal are equal to 1, as in

$$[I] = \begin{bmatrix} 1 & & \\ & 1 & \\ & & 1 \end{bmatrix}$$

The identity matrix has properties similar to unity. That is,

$$[A][I] = [I][A] = [A]$$

An upper triangular matrix is one where all the elements below the main diagonal are zero, as in

$$[A] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ & a_{22} & a_{23} \\ & & a_{33} \end{bmatrix}$$

A lower triangular matrix is one where all elements above the main diagonal are zero, as in

$$[A] = \begin{bmatrix} a_{11} & & \\ a_{21} & a_{22} & \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

A banded matrix has all elements equal to zero, with the exception of a band centered on the main diagonal:

$$[A] = \begin{bmatrix} a_{11} & a_{12} & & \\ a_{21} & a_{22} & a_{23} & \\ & a_{32} & a_{33} & a_{34} \\ & & a_{43} & a_{44} \end{bmatrix}$$

The preceding matrix has a bandwidth of 3 and is given a special name: the tridiagonal matrix.
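The special forms above can be generated and inspected with standard MATLAB built-ins. The following is a minimal sketch; the test matrix `A`, the size `n`, and the numerical entries are arbitrary choices made here for illustration.

```matlab
% Sketch: building the special square matrices with built-in functions.
n = 4;
A = magic(n);                 % an arbitrary full n-by-n test matrix

I = eye(n);                   % identity matrix
D = diag([2 5 7 1]);          % diagonal matrix from a vector of entries
U = triu(A);                  % upper triangular part of A
L = tril(A);                  % lower triangular part of A

% Tridiagonal (bandwidth 3): main diagonal plus one super- and subdiagonal
T = diag(4*ones(n,1)) + diag(-ones(n-1,1),1) + diag(-ones(n-1,1),-1);

S = A + A';                   % adding a matrix to its transpose gives a symmetric matrix
isSym = isequal(S, S');       % true when S(i,j) equals S(j,i) for all i, j

identityOK = isequal(A*I, A) && isequal(I*A, A);   % [A][I] = [I][A] = [A]
```

Displaying `U`, `L`, and `T` at the command line shows the zero blocks that the printed notation leaves blank.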


8.1.2 Matrix Operating Rules 


Now that we have specified what we mean by a matrix, we can define some operating rules that govern its use. Two m by n matrices are equal if, and only if, every element in the first is equal to the corresponding element in the second; that is, [A] = [B] if $a_{ij} = b_{ij}$ for all i and j.

Addition of two matrices, say, [A] and [B], is accomplished by adding corresponding terms in each matrix. The elements of the resulting matrix [C] are computed as

$$c_{ij} = a_{ij} + b_{ij}$$

for i = 1, 2, ..., m and j = 1, 2, ..., n. Similarly, the subtraction of two matrices, say, [E] minus [F], is obtained by subtracting corresponding terms, as in

$$d_{ij} = e_{ij} - f_{ij}$$

for i = 1, 2, ..., m and j = 1, 2, ..., n. It follows directly from the preceding definitions that addition and subtraction can be performed only between matrices having the same dimensions.

Both addition and subtraction are commutative:

$$[A] + [B] = [B] + [A]$$

and associative:

$$([A] + [B]) + [C] = [A] + ([B] + [C])$$

The multiplication of a matrix [A] by a scalar g is obtained by multiplying every element of [A] by g. For example, for a 3 × 3 matrix:

$$[D] = g[A] = \begin{bmatrix} ga_{11} & ga_{12} & ga_{13} \\ ga_{21} & ga_{22} & ga_{23} \\ ga_{31} & ga_{32} & ga_{33} \end{bmatrix}$$
FIGURE 8.4
Visual depiction of how the rows and columns line up in matrix multiplication. (The figure shows one element of the product being formed from a row and a column, for example 3 × 5 + 1 × 7 = 22.)

FIGURE 8.5
Matrix multiplication can be performed only if the inner dimensions are equal. For $[A]_{m \times n}[B]_{n \times l} = [C]_{m \times l}$, the interior dimensions (n) are equal, so multiplication is possible; the exterior dimensions (m and l) define the dimensions of the result.


The product of two matrices is represented as [C] = [A][B], where the elements of [C] are defined as

$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} \qquad (8.4)$$

where n = the column dimension of [A] and the row dimension of [B]. That is, the $c_{ij}$ element is obtained by adding the products of individual elements from the ith row of the first matrix, in this case [A], and the jth column of the second matrix [B]. Figure 8.4 depicts how the rows and columns line up in matrix multiplication.

According to this definition, matrix multiplication can be performed only if the first matrix has as many columns as the number of rows in the second matrix. Thus, if [A] is an m by n matrix, [B] could be an n by l matrix. For this case, the resulting [C] matrix would have the dimension of m by l. However, if [B] were an m by l matrix, the multiplication could not be performed. Figure 8.5 provides an easy way to check whether two matrices can be multiplied.

If the dimensions of the matrices are suitable, matrix multiplication is associative:

$$([A][B])[C] = [A]([B][C])$$

and distributive:

$$[A]([B] + [C]) = [A][B] + [A][C]$$

or

$$([A] + [B])[C] = [A][C] + [B][C]$$

However, multiplication is not generally commutative:

$$[A][B] \neq [B][A]$$

That is, the order of matrix multiplication is important.
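To make the definition in Eq. (8.4) concrete, the sum can be coded directly as three nested loops and compared with MATLAB's built-in `*` operator. This is a sketch for illustration only: the matrices `A`, `B`, `P`, and `Q` are arbitrary examples chosen here, and in practice the built-in operator should be preferred.

```matlab
% Sketch: matrix multiplication written out element by element (Eq. 8.4).
A = [3 1; 8 6; 0 4];          % 3-by-2 (illustrative values)
B = [5 9; 7 2];               % 2-by-2 (illustrative values)

[m, n]  = size(A);
[nB, l] = size(B);
if n ~= nB
    error('Inner matrix dimensions must agree.');
end

C = zeros(m, l);              % the result has the outer dimensions m-by-l
for i = 1:m
    for j = 1:l
        for k = 1:n
            % accumulate the products of row i of A and column j of B
            C(i,j) = C(i,j) + A(i,k)*B(k,j);
        end
    end
end

sameAsBuiltin = isequal(C, A*B);   % the loops agree with the * operator

% Order matters: even for square matrices, P*Q generally differs from Q*P
P = [1 2; 3 4];
Q = [0 1; 1 0];
commutes = isequal(P*Q, Q*P);
```

Here the first element is C(1,1) = 3 × 5 + 1 × 7 = 22, and `commutes` evaluates to false for this pair of matrices.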
Although multiplication is possible, matrix division is not a defined operation. However, if a matrix [A] is square and nonsingular, there is another matrix $[A]^{-1}$, called the inverse of [A], for which

$$[A][A]^{-1} = [A]^{-1}[A] = [I]$$

Thus, the multiplication of a matrix by the inverse is analogous to division, in the sense that a number divided by itself is equal to 1. That is, multiplication of a matrix by its inverse leads to the identity matrix.

The inverse of a 2 × 2 matrix can be represented simply by

$$[A]^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}}\begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$$


Similar formulas for higher-dimensional matrices are much more involved. Chapter 11 
will deal with techniques for using numerical methods and the computer to calculate the 
inverse for such systems. 
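The 2 × 2 formula is easy to check numerically. The sketch below applies it to an arbitrary nonsingular matrix chosen here for illustration and confirms that the product with the original matrix is the identity.

```matlab
% Sketch: closed-form inverse of a 2-by-2 matrix (illustrative values).
A = [4 7; 2 6];

d = A(1,1)*A(2,2) - A(1,2)*A(2,1);    % a11*a22 - a12*a21
if d == 0
    error('Matrix is singular; no inverse exists.');
end

% Swap the diagonal elements, negate the off-diagonal ones, divide by d
Ainv = (1/d) * [ A(2,2), -A(1,2);
                -A(2,1),  A(1,1)];

check = A*Ainv;                       % should equal eye(2) to roundoff
```

For larger matrices, no comparably simple formula is available, which is why numerical techniques are used instead.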

The transpose of a matrix involves transforming its rows into columns and its columns into rows. For example, for the 3 × 3 matrix:

$$[A] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

the transpose, designated $[A]^T$, is defined as

$$[A]^T = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}$$

In other words, the element $a_{ij}$ of the transpose is equal to the $a_{ji}$ element of the original matrix.

The transpose has a variety of functions in matrix algebra. One simple advantage is that it allows a column vector to be written as a row, and vice versa. For example, if

$$\{c\} = \begin{Bmatrix} c_1 \\ c_2 \\ c_3 \end{Bmatrix}$$

then

$$\{c\}^T = \lfloor c_1 \;\; c_2 \;\; c_3 \rfloor$$

In addition, the transpose has numerous mathematical applications.

A permutation matrix (also called a transposition matrix) is an identity matrix with rows and columns interchanged. For example, here is a permutation matrix that is constructed by switching the first and third rows and columns of a 3 × 3 identity matrix:

$$[P] = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$

Left multiplying a matrix [A] by this matrix, as in [P][A], will switch the corresponding rows of [A]. Right multiplying, as in [A][P], will switch the corresponding columns. Here is an example of left multiplication:

$$[P][A] = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} 2 & -7 & 4 \\ 8 & 3 & -6 \\ 5 & 1 & 9 \end{bmatrix} = \begin{bmatrix} 5 & 1 & 9 \\ 8 & 3 & -6 \\ 2 & -7 & 4 \end{bmatrix}$$

The final matrix manipulation that will have utility in our discussion is augmentation. A matrix is augmented by the addition of a column (or columns) to the original matrix. For example, suppose we have a 3 × 3 matrix of coefficients. We might wish to augment this matrix [A] with a 3 × 3 identity matrix to yield a 3 × 6 dimensional matrix:

$$\left[\begin{array}{ccc|ccc}
a_{11} & a_{12} & a_{13} & 1 & 0 & 0 \\
a_{21} & a_{22} & a_{23} & 0 & 1 & 0 \\
a_{31} & a_{32} & a_{33} & 0 & 0 & 1
\end{array}\right]$$

Such an expression has utility when we must perform a set of identical operations on the rows of two matrices. Thus, we can perform the operations on the single augmented matrix rather than on the two individual matrices.


MATLAB Matrix Manipulations 


Problem Statement. The following example illustrates how a variety of matrix manipula- 
tions are implemented with MATLAB. It is best approached as a hands-on exercise on the 
computer. 

Solution. Create a 3 x 3 matrix: 


>> A = [1 5 6;7 4 2;-3 6 7]


A =
1 5 6
7 4 2
-3 6 7


The transpose of [A] can be obtained using the ' operator: 


>> A' 

ans = 
1 7 -3
5 4 6 
6 2 7 


Next we will create another 3 x 3 matrix on a row basis. First create three row vectors: 


>> x = [8 6 9]; 
>> y = [-5 8 1]; 
>> z= [4 8 2]; 
Then we can combine these to form the matrix:


>> B = [x; y; z]


B =
8 6 9
-5 8 1
4 8 2


We can add [A] and [B] together: 


>> C = A+B 

C= 
9 11 15 
2 12 3 
1 14 9 


Further, we can subtract [B] from [C] to arrive back at [A]: 


>> A=C-B 

A= 
1 5 6 
7 4 2 
-3 6 7 


Because their inner dimensions are equal, [A] and [B] can be multiplied 


>> A*B 
ans = 
7 94 26 
44 90 71 
-26 86 -7 


Note that [A] and [B] can also be multiplied on an element-by-element basis by including 
a period with the multiplication operator as in 


>> A.*B 
ans = 
8 30 54 
-35 32 2
-12 48 14 


A 2 x 3 matrix can be set up 

>> D = [1 4 3;5 8 1]; 

If [A] is multiplied by [D], an error message will occur because the inner dimensions (3 and 2) do not match:
>> A*D 


??? Error using ==> mtimes 
Inner matrix dimensions must agree. 
However, if we reverse the order of multiplication so that the inner dimensions match,
matrix multiplication works


>> D*A 

ans = 
20 39 35 
58 63 53 


The matrix inverse can be computed with the inv function: 
>> AI = inv(A) 


AI = 


0.2462 0.0154 -0.2154 
-0.8462 0.3846 0.6154 
0.8308 -0.3231 -0.4769 


To test that this is the correct result, the inverse can be multiplied by the original matrix to 
give the identity matrix: 


>> A*AI 

ans = 
1.0000 -0.0000 -0.0000 
0.0000 1.0000 -0.0000 
0.0000 -0.0000 1.0000 
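
The -0.0000 entries above are roundoff residue rather than exact zeros. A minimal sketch of a more robust check compares the product with the identity matrix using a norm. The tolerance tol and the names resid and isIdentity are our own assumptions and are not part of the example.

```matlab
A = [1 5 6; 7 4 2; -3 6 7];   % matrix from this example
AI = inv(A);

% Largest row-sum deviation of A*AI from the identity matrix
resid = norm(A*AI - eye(3), inf);

% tol is an assumed tolerance, well above machine precision
tol = 1e-10;
isIdentity = resid < tol;
```

Testing with a tolerance avoids the false negatives that an exact comparison such as isequal(A*AI, eye(3)) can produce.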


The eye function can be used to generate an identity matrix: 


>> I = eye(3) 

I= 
1 0 0 
0 1 0 
0 0 1 


We can set up a permutation matrix to switch the first and third rows and columns of
a 3 x 3 matrix as


>> P=[0 0 1;0 1 0;1 0 0] 


P= 
0 0 1
0 1 0 
1 0 0 


We can then either switch the rows:
>> PA=P*A


PA =
-3 6 7
7 4 2
1 5 6


or the columns:


>> AP=A*P
AP =
6 5 1
2 4 7
7 6 -3


Finally, matrices can be augmented simply as in


>> Aug = [A I]
Aug =
1 5 6 1 0 0
7 4 2 0 1 0
-3 6 7 0 0 1


Note that the dimensions of a matrix can be determined by the size function:


>> [n,m] = size(Aug)
n =
3
m =
6

8.1.3 Representing Linear Algebraic Equations in Matrix Form 


It should be clear that matrices provide a concise notation for representing simultaneous 
linear equations. For example, a 3 x 3 set of linear equations, 


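$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2 \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3 \end{aligned} \tag{8.5}$$

can be expressed as

$$[A]\{x\} = \{b\} \tag{8.6}$$

where [A] is the matrix of coefficients:

$$[A] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

{b} is the column vector of constants:

$$\{b\}^T = \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix}$$

and {x} is the column vector of unknowns:

$$\{x\}^T = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}$$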


Recall the definition of matrix multiplication [Eq. (8.4)] to convince yourself that 
Eqs. (8.5) and (8.6) are equivalent. Also, realize that Eq. (8.6) is a valid matrix multiplica- 
tion because the number of columns n of the first matrix [A] is equal to the number of rows 
n of the second matrix {x}. 

This part of the book is devoted to solving Eq. (8.6) for {x}. A formal way to obtain a 
solution using matrix algebra is to multiply each side of the equation by the inverse of [A] 
to yield 


$$[A]^{-1}[A]\{x\} = [A]^{-1}\{b\}$$

Because $[A]^{-1}[A]$ equals the identity matrix, the equation becomes

$$\{x\} = [A]^{-1}\{b\} \tag{8.7}$$


Therefore, the equation has been solved for {x}. This is another example of how the in- 
verse plays a role in matrix algebra that is similar to division. It should be noted that this 
is not a very efficient way to solve a system of equations. Thus, other approaches are em- 
ployed in numerical algorithms. However, as discussed in Sec. 11.1.2, the matrix inverse 
itself has great value in the engineering analyses of such systems. 

It should be noted that systems with more equations (rows) than unknowns (columns), 
m > n, are said to be overdetermined. A typical example is least-squares regression where 
an equation with n coefficients is fit to m data points (x, y). Conversely, systems with fewer
equations than unknowns, m < n, are said to be underdetermined. A typical example of
underdetermined systems is numerical optimization.
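
For the overdetermined case, MATLAB's backslash operator returns the least-squares solution. The following minimal sketch fits a straight line y = a0 + a1*x to four data points. The data values and the names Z, a, and a_normal are made up for illustration.

```matlab
% Four hypothetical data points (m = 4) and a two-coefficient model (n = 2)
x = [1; 2; 3; 4];
y = [2.1; 3.9; 6.2; 7.8];

% Each row of Z is one equation: a0*1 + a1*x(i) = y(i)
Z = [ones(size(x)) x];

% m > n, so Z*a = y generally has no exact solution;
% backslash returns the coefficients that minimize norm(Z*a - y)
a = Z\y;

% The same coefficients follow from the normal equations
a_normal = (Z'*Z)\(Z'*y);
```

Here a(1) is the intercept and a(2) is the slope of the fitted line.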


8.2 SOLVING LINEAR ALGEBRAIC EQUATIONS WITH MATLAB


MATLAB provides two direct ways to solve systems of linear algebraic equations. The 
most efficient way is to employ the backslash, or “left-division,” operator as in 


>> x = A\b 
The second is to use matrix inversion: 
>> x = inv(A)*b 


As stated at the end of Sec. 8.1.3, the matrix inverse solution is less efficient than using the 
backslash. Both options are illustrated in the following example. 




EXAMPLE 8.2 Solving the Bungee Jumper Problem with MATLAB 


Problem Statement. Use MATLAB to solve the bungee jumper problem described at the 
beginning of this chapter. The parameters for the problem are 


Jumper        Mass (kg)    Spring Constant (N/m)    Unstretched Cord Length (m)
Top (1)       60           50                       20
Middle (2)    70           100                      20
Bottom (3)    80           50                       20


Solution. Substituting these parameter values into Eq. (8.2) gives 


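$$\begin{bmatrix} 150 & -100 & 0 \\ -100 & 150 & -50 \\ 0 & -50 & 50 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} = \begin{Bmatrix} 588.6 \\ 686.7 \\ 784.8 \end{Bmatrix}$$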


Start up MATLAB and enter the coefficient matrix and the right-hand-side vector: 
>> K = [150 -100 0;-100 150 -50;0 -50 50] 


K = 
150 -100 0 
-100 150 -50 
0 -50 50 


>> mg = [588.6; 686.7; 784.8] 


mg = 

588.6000
686.7000
784.8000


Employing left division yields 
>> x = K\mg 


x= 
41.2020 
55.9170 
71.6130 


Alternatively, multiplying the inverse of the coefficient matrix by the right-hand-side vec- 
tor gives the same result: 


>> x = inv(K)*mg 


x =
41.2020 
55.9170 
71.6130 
Because the jumpers were connected by 20-m cords, their initial positions relative to the
platform are
>> xi = [20;40;60];
Thus, their final positions can be calculated as
>> xf = x+xi
xf =
61.2020
95.9170
131.6130
The results, which are displayed in Fig. 8.6, make sense. The first cord is extended the
longest because it has a lower spring constant and is subject to the most weight (all three
jumpers). Notice that the second and third cords are extended about the same amount.
Because it is subject to the weight of two jumpers, one might expect the second cord to be
extended longer than the third. However, because it is stiffer (i.e., it has a higher spring
constant), it stretches less than expected based on the weight it carries.

FIGURE 8.6
Positions of three individuals connected by bungee cords. (a) Unstretched and (b) stretched.
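
The statement about the relative cord extensions can be checked directly from the displacements. The following is a minimal sketch. The variable name stretch is our own, and the displacement vector is padded with a zero to represent the fixed platform.

```matlab
K = [150 -100 0; -100 150 -50; 0 -50 50];
mg = [588.6; 686.7; 784.8];
x = K\mg;                 % displacement of each jumper (m)

% Extension of each cord = difference between the displacements of its two ends;
% the top of the first cord is attached to the platform (displacement 0)
stretch = diff([0; x]);
```

The first cord stretches by about 41.2 m, whereas the second and third stretch by about 14.7 m and 15.7 m, respectively.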
8.3 CASE STUDY


Background. Recall that in Chap. 1 (Table 1.1), we summarized some models and associated
conservation laws that figure prominently in engineering. As in Fig. 8.7, each model
represents a system of interacting elements. Consequently, steady-state balances derived 
from the conservation laws yield systems of simultaneous equations. In many cases, such 
systems are linear and hence can be expressed in matrix form. The present case study 
focuses on one such application: circuit analysis. 

A common problem in electrical engineering involves determining the currents and 
voltages at various locations in resistor circuits. These problems are solved using Kirch- 
hoff’s current and voltage rules. The current (or point) rule states that the algebraic sum of 
all currents entering a node must be zero (Fig. 8.8a), or 


$$\sum i = 0 \tag{8.8}$$


where all current entering the node is considered positive in sign. The current rule is an 
application of the principle of conservation of charge (recall Table 1.1). 
FIGURE 8.7
Engineering systems which, at steady state, can be modeled with linear algebraic equations:
(a) chemical engineering, (b) civil engineering, (c) electrical engineering, and
(d) mechanical engineering.

FIGURE 8.8
Schematic representations of (a) Kirchhoff’s current rule and (b) Ohm’s law.


The voltage (or loop) rule specifies that the algebraic sum of the potential differences
(i.e., voltage changes) in any loop must equal zero. For a resistor circuit, this is expressed as


$$\sum \xi - \sum iR = 0 \tag{8.9}$$


where $\xi$ is the emf (electromotive force) of the voltage sources, and R is the resistance of
any resistors on the loop. Note that the second term derives from Ohm’s law (Fig. 8.8), 
which states that the voltage drop across an ideal resistor is equal to the product of the 
current and the resistance. Kirchhoff’s voltage rule is an expression of the conservation of 
energy. 


Solution. Application of these rules results in systems of simultaneous linear algebraic 
equations because the various loops within a circuit are interconnected. For example, con- 
sider the circuit shown in Fig. 8.9. The currents associated with this circuit are unknown 
both in magnitude and direction. This presents no great difficulty because one simply 
assumes a direction for each current. If the resultant solution from Kirchhoff’s laws is 
negative, then the assumed direction was incorrect. For example, Fig. 8.10 shows some 
assumed currents. 
FIGURE 8.9
A resistor circuit to be solved using simultaneous linear algebraic equations. (Labels
recoverable from the figure include voltages of 200 V at node 1 and 0 V at node 6.)

FIGURE 8.10
Assumed current directions.


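Given these assumptions, Kirchhoff’s current rule is applied at each node to yield

$$\begin{aligned} i_{12} + i_{52} + i_{32} &= 0 \\ i_{65} - i_{52} - i_{54} &= 0 \\ i_{43} - i_{32} &= 0 \\ i_{54} - i_{43} &= 0 \end{aligned}$$

Application of the voltage rule to each of the two loops gives

$$\begin{aligned} -i_{54}R_{54} - i_{43}R_{43} - i_{32}R_{32} + i_{52}R_{52} &= 0 \\ -i_{65}R_{65} - i_{52}R_{52} + i_{12}R_{12} - 200 &= 0 \end{aligned}$$

or, substituting the resistances from Fig. 8.9 and bringing constants to the right-hand side,

$$\begin{aligned} -15i_{54} - 5i_{43} - 10i_{32} + 10i_{52} &= 0 \\ -20i_{65} - 10i_{52} + 5i_{12} &= 200 \end{aligned}$$

Therefore, the problem amounts to solving six equations with six unknown currents. These
equations can be expressed in matrix form as

$$\begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & -1 & 0 \\ 0 & 0 & -1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & -1 \\ 0 & 10 & -10 & 0 & -15 & -5 \\ 5 & -10 & 0 & -20 & 0 & 0 \end{bmatrix} \begin{Bmatrix} i_{12} \\ i_{52} \\ i_{32} \\ i_{65} \\ i_{54} \\ i_{43} \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 200 \end{Bmatrix}$$
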


Although impractical to solve by hand, this system is easily handled by MATLAB. 
The solution is 


>> A=[1 1 1 0 0 0
0 -1 0 1 -1 0
0 0 -1 0 0 1
0 0 0 0 1 -1
0 10 -10 0 -15 -5
5 -10 0 -20 0 0];
>> b=[0 0 0 0 0 200]';
>> current=A\b


current =
6.1538
-4.6154
-1.5385
-6.1538
-1.5385
-1.5385

Thus, with proper interpretation of the signs of the result, the circuit currents and volt- 
ages are as shown in Fig. 8.11. The advantages of using MATLAB for problems of this 
type should be evident. 
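
The node voltages follow from the currents through Ohm’s law. The sketch below is one way to do the bookkeeping. It assumes that a current with subscript xy flows from node x to node y, so that the voltage at y equals the voltage at x minus the current times the resistance. This sign convention and the variable names are our assumptions, chosen to be consistent with the loop equations of this case study.

```matlab
A = [1   1   1   0   0   0
     0  -1   0   1  -1   0
     0   0  -1   0   0   1
     0   0   0   0   1  -1
     0  10 -10   0 -15  -5
     5 -10   0 -20   0   0];
b = [0 0 0 0 0 200]';
current = A\b;            % order: i12, i52, i32, i65, i54, i43

i12 = current(1); i32 = current(3);
i54 = current(5); i43 = current(6);

V1 = 200;                 % given source voltage at node 1
V2 = V1 - i12*5;          % drop across R12 = 5 ohms
V3 = V2 + i32*10;         % i32 runs from node 3 to node 2, R32 = 10 ohms
V4 = V3 + i43*5;          % R43 = 5 ohms
V5 = V4 + i54*15;         % R54 = 15 ohms
V = [V2 V3 V4 V5];
```

The negative values of i32, i43, and i54 simply mean that those currents flow opposite to the assumed directions, so the voltage falls steadily from node 2 to node 5.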


FIGURE 8.11
The solution for currents and voltages obtained using MATLAB. (Values labeled in the
figure: V = 200, 169.23, 153.85, 146.15, and 123.08; i = 6.1538.)
PROBLEMS


8.1 Given a square matrix [A], write a single-line MATLAB
command that will create a new matrix [Aug] that consists of
the original matrix [A] augmented by an identity matrix [I].
8.2 A number of matrices are defined as 


4 7 43 7 
[A] = | 4 [B] = | 2 | 
5 6 20 4 


3 
_f9 4 3 -6 
TEG ipi =| -1 7 | 


15 8 
[E]=|7 2 3 
4 0 6 


m= f 0 ' 


173 [G] = [764] 


Answer the following questions regarding these matrices: 
(a) What are the dimensions of the matrices? 

(b) Identify the square, column, and row matrices. 

(c) What are the values of the elements: $a_{12}$, $b_{23}$, $d_{32}$, $e_{22}$, $f_{12}$, $g_{12}$?
(d) Perform the following operations:


(1) [E] + [B]      (2) [A] + [F]        (3) [B] - [E]
(4) 7 x [B]        (5) {C}^T            (6) [E] x [B]
(7) [B] x [A]      (8) [D]^T            (9) [A] x {C}
(10) [I] x [B]     (11) [E]^T x [E]     (12) {C}^T x {C}


8.3 Write the following set of equations in matrix form: 

50 = 5x, — 7x, 

4x, + 7x; +30 = 0 

xX, — 7x, = 40 — 3x, + 5x, 
Use MATLAB to solve for the unknowns. In addition, use 
it to compute the transpose and the inverse of the coefficient 


matrix. 
8.4 Three matrices are defined as 


6 -l1 
m| y] mi S eg a 


(a) Perform all possible multiplications that can be com- 
puted between pairs of these matrices. 
(b) Justify why the remaining pairs cannot be multiplied. 


(c) Use the results of (a) to illustrate why the order of mul- 
tiplication is important. 
8.5 Solve the following system with MATLAB: 


ies | zy eae 

=i il {21} “3 

8.6 Develop, debug, and test your own M-file to multiply 
two matrices—that is, [X] = [Y][Z], where [Y] is m by n 
and [Z] is n by p. Employ for...end loops to implement the
multiplication and include error traps to flag bad cases. Test 
the program using the matrices from Prob. 8.4. 

8.7 Develop, debug, and test your own M-file to gener- 
ate the transpose of a matrix. Employ for...end loops 
to implement the transpose. Test it on the matrices from 
Prob. 8.4. 

8.8 Develop, debug, and test your own M-file function to 
switch the rows of a matrix using a permutation matrix. The 
first lines of the function should be as follows: 


function B = permut(A,r1,r2)
% Permut: Switch rows of matrix A
% with a permutation matrix
% B = permut(A,r1,r2)
% input:
% A = original matrix
% r1, r2 = rows to be switched
% output:
% B = matrix with rows switched


Include error traps for erroneous inputs (e.g., user specifies 
rows that exceed the dimensions of the original matrix). 

8.9 Five reactors linked by pipes are shown in Fig. P8.9. 
The rate of mass flow through each pipe is computed as the 
product of flow (Q) and concentration (c). At steady state, 
the mass flow into and out of each reactor must be equal. 
For example, for the first reactor, a mass balance can be 
written as 


$$Q_{01}c_{01} + Q_{31}c_3 = Q_{15}c_1 + Q_{12}c_1$$


Write mass balances for the remaining reactors in Fig. P8.9 
and express the equations in matrix form. Then use MATLAB 
to solve for the concentrations in each reactor. 

FIGURE P8.9

8.10 An important problem in structural engineering is
that of finding the forces in a statically determinate truss
(Fig. P8.10). This type of structure can be described as a
system of coupled linear algebraic equations derived from
force balances. The sum of the forces in both horizontal and
vertical directions must be zero at each node, because the
system is at rest. Therefore, for node 1:

$$\sum F_H = 0 = -F_1\cos 30^\circ + F_3\cos 60^\circ + F_{1,h}$$
$$\sum F_V = 0 = -F_1\sin 30^\circ - F_3\sin 60^\circ + F_{1,v}$$

for node 2:

$$\sum F_H = 0 = F_2 + F_1\cos 30^\circ + F_{2,h} + H_2$$
$$\sum F_V = 0 = F_1\sin 30^\circ + F_{2,v} + V_2$$

for node 3:

$$\sum F_H = 0 = -F_2 - F_3\cos 60^\circ + F_{3,h}$$
$$\sum F_V = 0 = F_3\sin 60^\circ + F_{3,v} + V_3$$

where $F_{i,h}$ is the external horizontal force applied to node i
(where a positive force is from left to right) and $F_{i,v}$ is the
external vertical force applied to node i (where a positive force
is upward). Thus, in this problem, the 2000-N downward
force on node 1 corresponds to $F_{1,v} = -2000$. For this case,
all other $F_{i,v}$’s and $F_{i,h}$’s are zero. Express this set of linear
algebraic equations in matrix form and then use MATLAB
to solve for the unknowns.

FIGURE P8.10

8.11 Consider the three mass-four spring system in
Fig. P8.11. Determining the equations of motion from $\sum F_x = ma_x$
for each mass using its free-body diagram results in the
following differential equations:

$$\ddot{x}_1 + \left(\frac{k_1 + k_2}{m_1}\right)x_1 - \left(\frac{k_2}{m_1}\right)x_2 = 0$$

$$\ddot{x}_2 - \left(\frac{k_2}{m_2}\right)x_1 + \left(\frac{k_2 + k_3}{m_2}\right)x_2 - \left(\frac{k_3}{m_2}\right)x_3 = 0$$

$$\ddot{x}_3 - \left(\frac{k_3}{m_3}\right)x_2 + \left(\frac{k_3 + k_4}{m_3}\right)x_3 = 0$$

where $k_1 = k_4 = 10$ N/m, $k_2 = k_3 = 30$ N/m, and $m_1 = m_2 =
m_3 = 1$ kg. The three equations can be written in matrix form:

0 = {Acceleration vector} + [k/m matrix] {displacement vector x}

At a specific time where $x_1 = 0.05$ m, $x_2 = 0.04$ m, and
$x_3 = 0.03$ m, this forms a tridiagonal matrix. Use MATLAB
to solve for the acceleration of each mass.

FIGURE P8.11

8.12 Perform the same computation as in Example 8.2, but 
use five jumpers with the following characteristics: 


Jumper    Mass (kg)    Spring Constant (N/m)    Unstretched Cord Length (m)
1         55           80                       10
2         75           50                       10
3         60           70                       10
4         75           100                      10
5         90           20                       10


FIGURE P8.14


8.13 Three masses are suspended vertically by a series of
identical springs where mass 1 is at the top and mass 3 is
at the bottom. If $g = 9.81$ m/s$^2$, $m_1 = 2$ kg, $m_2 = 3$ kg,
$m_3 = 2.5$ kg, and the k’s = 10 kg/s$^2$, use MATLAB to solve for the
displacements x.
8.14 Perform the same computation as in Sec. 8.3, but for 
the circuit in Fig. P8.14. 
8.15 Perform the same computation as in Sec. 8.3, but for 
the circuit in Fig. P8.15. 
FIGURE P8.15

8.16 Besides solving simultaneous equations, linear algebra
has lots of other applications in engineering and science. An
example from computer graphics involves rotating an object
in Euclidean space. The following rotation matrix can
be employed to rotate a group of points counter-clockwise
through an angle $\theta$ about the origin of a Cartesian coordinate
system,

$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

To do this, each point’s position must be represented by a
column vector v, containing the coordinates of the point. For
example, here are vectors for the x and y coordinates of the
rectangle in Fig. P8.16


x = [1 4 4 1]; y = [1 1 4 4];


The rotated vector is then generated with matrix multipli- 
cation: [R]{v}. Develop a MATLAB function to perform 
this operation and display the initial and the rotated points 
as filled shapes on the same graph. Here is a script to test 
your function: 


clc; clf; format compact
x = [1 4 4 1]; y = [1 1 4 4];
[xt, yt] = Rotate2D(45, x, y);


and here is a skeleton of the function 


function [xr, yr] = Rotate2D(thetad, x, y)
% two dimensional rotation 2D rotate Cartesian
% [xr, yr] = Rotate2D(thetad, x, y)
% Rotation of a two-dimensional object the Cartesian coordinates
% of which are contained in the vectors x and y.
% input:
% thetad = angle of rotation (degrees)
% x = vector containing object's x coordinates
% y = vector containing object's y coordinates
% output:
% xr = vector containing object's rotated x coordinates
% yr = vector containing object's rotated y coordinates

% convert angle to radians and set up rotation matrix

% close shape

% plot original object
hold on, grid on

% rotate shape

% plot rotated object

hold off


FIGURE P8.16
(Rectangle with corners at (1, 1), (4, 1), (4, 4), and (1, 4).)
CHAPTER OBJECTIVES


The primary objective of this chapter is to describe the Gauss elimination algorithm 
for solving linear algebraic equations. Specific objectives and topics covered are 


• Knowing how to solve small sets of linear equations with the graphical method
  and Cramer’s rule.
• Understanding how to implement forward elimination and back substitution as in
  Gauss elimination.
• Understanding how to count flops to evaluate the efficiency of an algorithm.
• Understanding the concepts of singularity and ill-condition.
• Understanding how partial pivoting is implemented and how it differs from
  complete pivoting.
• Knowing how to compute the determinant as part of the Gauss elimination
  algorithm with partial pivoting.
• Recognizing how the banded structure of a tridiagonal system can be exploited
  to obtain extremely efficient solutions.


At the end of Chap. 8, we stated that MATLAB provides two simple and direct
methods for solving systems of linear algebraic equations: left division,


>> x = A\b 
and matrix inversion, 
>> x = inv(A)*b 


Chapters 9 and 10 provide background on how such solutions are obtained. This mate- 
rial is included to provide insight into how MATLAB operates. In addition, it is intended to 
show how you can build your own solution algorithms in computational environments that 
do not have MATLAB’s built-in capabilities. 


The technique described in this chapter is called Gauss elimination because it involves 
combining equations to eliminate unknowns. Although it is one of the earliest methods 
for solving simultaneous equations, it remains among the most important algorithms in 
use today and is the basis for linear equation solving on many popular software packages 
including MATLAB. 


9.1 SOLVING SMALL NUMBERS OF EQUATIONS


Before proceeding to Gauss elimination, we will describe several methods that are
appropriate for solving small (n ≤ 3) sets of simultaneous equations and that do not require a
computer. These are the graphical method, Cramer’s rule, and the elimination of unknowns.


9.1.1 The Graphical Method 


A graphical solution is obtainable for two linear equations by plotting them on Cartesian
coordinates with one axis corresponding to $x_1$ and the other to $x_2$. Because the equations
are linear, each equation will plot as a straight line. For example, suppose that we have the
following equations:

$$\begin{aligned} 3x_1 + 2x_2 &= 18 \\ -x_1 + 2x_2 &= 2 \end{aligned}$$

If we assume that $x_1$ is the abscissa, we can solve each of these equations for $x_2$:

$$\begin{aligned} x_2 &= -\frac{3}{2}x_1 + 9 \\ x_2 &= \frac{1}{2}x_1 + 1 \end{aligned}$$

The equations are now in the form of straight lines, that is, $x_2$ = (slope) $x_1$ + intercept.
When these equations are graphed, the values of $x_1$ and $x_2$ at the intersection of the lines
represent the solution (Fig. 9.1). For this case, the solution is $x_1 = 4$ and $x_2 = 3$.
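
A plot of this kind can be produced with a few lines of MATLAB. The sketch below is one possible way, with the plotting range and variable names chosen for illustration. It also computes the intersection with left division as a check.

```matlab
% Range of abscissa values (an assumed plotting window)
x1 = linspace(0, 6, 50);

% Each equation solved for x2
x2a = -(3/2)*x1 + 9;      % from 3*x1 + 2*x2 = 18
x2b = (1/2)*x1 + 1;       % from  -x1 + 2*x2 = 2

plot(x1, x2a, x1, x2b), grid on
xlabel('x_1'), ylabel('x_2')
legend('3x_1 + 2x_2 = 18', '-x_1 + 2x_2 = 2')

% The intersection can be confirmed by solving the system directly
A = [3 2; -1 2];
b = [18; 2];
xsol = A\b;

% Mark the intersection point on the plot
hold on, plot(xsol(1), xsol(2), 'ko'), hold off
```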

For three simultaneous equations, each equation would be represented by a plane in 
a three-dimensional coordinate system. The point where the three planes intersect would 
represent the solution. Beyond three equations, graphical methods break down and, conse- 
quently, have little practical value for solving simultaneous equations. However, they are 
useful in visualizing properties of the solutions. 

For example, Fig. 9.2 depicts three cases that can pose problems when solving sets of 
linear equations. Fig. 9.2a shows the case where the two equations represent parallel lines. 
For such situations, there is no solution because the lines never cross. Figure 9.2b depicts 
the case where the two lines are coincident. For such situations there is an infinite number 
of solutions. Both types of systems are said to be singular. 

In addition, systems that are very close to being singular (Fig. 9.2c) can also cause 
problems. These systems are said to be ill-conditioned. Graphically, this corresponds to the
fact that it is difficult to identify the exact point at which the lines intersect. Ill-conditioned 
systems will also pose problems when they are encountered during the numerical solution 
of linear equations. This is because they will be extremely sensitive to roundoff error. 
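
The sensitivity of a nearly singular system can be seen numerically. In the following minimal sketch, the two lines have slopes of 0.5 and 0.46. These coefficients, the size of the perturbation, and the variable names are illustrative assumptions.

```matlab
% Two nearly parallel lines: -0.5*x1 + x2 = 1 and -0.46*x1 + x2 = 1.1
A = [-0.5 1; -0.46 1];
b = [1; 1.1];
x = A\b;

% Change the second right-hand-side value by about 1 percent
b_pert = [1; 1.11];
x_pert = A\b_pert;

% Relative change in the solution caused by the small change in b
rel_change = norm(x_pert - x)/norm(x);
```

A change of less than 1 percent in one constant moves the solution by roughly 8 percent, which is the kind of amplification that makes such systems sensitive to roundoff error.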
FIGURE 9.1
Graphical solution of a set of two simultaneous linear algebraic equations. The intersection of
the lines represents the solution ($x_1 = 4$; $x_2 = 3$).


FIGURE 9.2
Graphical depiction of singular and ill-conditioned systems: (a) no solution, (b) infinite solutions, and
(c) ill-conditioned system where the slopes are so close that the point of intersection is difficult to detect visually.


9.1.2 Determinants and Cramer’s Rule 


Cramer’s rule is another solution technique that is best suited to small numbers of equa- 
tions. Before describing this method, we will briefly review the concept of the determinant, 
which is used to implement Cramer’s rule. In addition, the determinant has relevance to the 
evaluation of the ill-conditioning of a matrix. 
Determinants. The determinant can be illustrated for a set of three equations:

$$[A]\{x\} = \{b\}$$

where [A] is the coefficient matrix

$$[A] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

The determinant of this system is formed from the coefficients of [A] and is represented as

$$D = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$

Although the determinant D and the coefficient matrix [A] are composed of the same 
elements, they are completely different mathematical concepts. That is why they are dis- 
tinguished visually by using brackets to enclose the matrix and straight lines to enclose the 
determinant. In contrast to a matrix, the determinant is a single number. For example, the 
value of the determinant for two simultaneous equations 


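$$D = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$$

is calculated by

$$D = a_{11}a_{22} - a_{12}a_{21}$$

For the third-order case, the determinant can be computed as

$$D = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \tag{9.1}$$

where the 2 x 2 determinants are called minors.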
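
Equation (9.1) translates directly into code. The following minimal sketch uses a hypothetical helper, det2, for the 2 x 2 determinants and an arbitrary test matrix. It is meant only to illustrate the expansion by minors and is compared against the built-in det function.

```matlab
% Hypothetical 3 x 3 matrix used only for illustration
A = [1 5 6; 7 4 2; -3 6 7];

% det2 is an assumed helper: determinant of a 2 x 2 matrix
det2 = @(M) M(1,1)*M(2,2) - M(1,2)*M(2,1);

% Expansion along the first row: each minor deletes row 1 and one column
D = A(1,1)*det2(A(2:3,[2 3])) ...
  - A(1,2)*det2(A(2:3,[1 3])) ...
  + A(1,3)*det2(A(2:3,[1 2]));

% Difference from MATLAB's built-in result should be at roundoff level
err = abs(D - det(A));
```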


EXAMPLE 9.1 Determinants


Problem Statement. Compute values for the determinants of the systems represented in 
Figs. 9.1 and 9.2. 


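Solution. For Fig. 9.1:

$$D = \begin{vmatrix} 3 & 2 \\ -1 & 2 \end{vmatrix} = 3(2) - 2(-1) = 8$$

For Fig. 9.2a:

$$D = \begin{vmatrix} -\frac{1}{2} & 1 \\ -\frac{1}{2} & 1 \end{vmatrix} = -\frac{1}{2}(1) - 1\left(-\frac{1}{2}\right) = 0$$

For Fig. 9.2b:

$$D = \begin{vmatrix} -\frac{1}{2} & 1 \\ -1 & 2 \end{vmatrix} = -\frac{1}{2}(2) - 1(-1) = 0$$

For Fig. 9.2c:

$$D = \begin{vmatrix} -\frac{1}{2} & 1 \\ -\frac{2.3}{5} & 1 \end{vmatrix} = -\frac{1}{2}(1) - 1\left(-\frac{2.3}{5}\right) = -0.04$$
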


In the foregoing example, the singular systems had zero determinants. Additionally, 
the results suggest that the system that is almost singular (Fig. 9.2c) has a determinant 
that is close to zero. These ideas will be pursued further in our subsequent discussion of 
ill-conditioning in Chap. 11. 


Cramer’s Rule. This rule states that each unknown in a system of linear algebraic equations
may be expressed as a fraction of two determinants with denominator D and with the
numerator obtained from D by replacing the column of coefficients of the unknown in question
by the constants $b_1, b_2, \ldots, b_n$. For example, for three equations, $x_1$ would be computed as

$$x_1 = \frac{\begin{vmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{vmatrix}}{D}$$


EXAMPLE 9.2 Cramer’s Rule
Problem Statement. Use Cramer’s rule to solve 


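$$\begin{aligned} 0.3x_1 + 0.52x_2 + x_3 &= -0.01 \\ 0.5x_1 + x_2 + 1.9x_3 &= 0.67 \\ 0.1x_1 + 0.3x_2 + 0.5x_3 &= -0.44 \end{aligned}$$


Solution. The determinant D can be evaluated as [Eq. (9.1)]:

$$D = 0.3\begin{vmatrix} 1 & 1.9 \\ 0.3 & 0.5 \end{vmatrix} - 0.52\begin{vmatrix} 0.5 & 1.9 \\ 0.1 & 0.5 \end{vmatrix} + 1\begin{vmatrix} 0.5 & 1 \\ 0.1 & 0.3 \end{vmatrix} = -0.0022$$

The solution can be calculated as

$$x_1 = \frac{\begin{vmatrix} -0.01 & 0.52 & 1 \\ 0.67 & 1 & 1.9 \\ -0.44 & 0.3 & 0.5 \end{vmatrix}}{-0.0022} = \frac{0.03278}{-0.0022} = -14.9$$

$$x_2 = \frac{\begin{vmatrix} 0.3 & -0.01 & 1 \\ 0.5 & 0.67 & 1.9 \\ 0.1 & -0.44 & 0.5 \end{vmatrix}}{-0.0022} = \frac{0.0649}{-0.0022} = -29.5$$

$$x_3 = \frac{\begin{vmatrix} 0.3 & 0.52 & -0.01 \\ 0.5 & 1 & 0.67 \\ 0.1 & 0.3 & -0.44 \end{vmatrix}}{-0.0022} = \frac{-0.04356}{-0.0022} = 19.8$$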
The det Function. The determinant can be computed directly in MATLAB with the det
function. For example, using the system from the previous example:


>> A=[0.3 0.52 1;0.5 1 1.9;0.1 0.3 0.5];
>> D=det(A)


D =
-0.0022


Cramer’s rule can be applied to compute $x_1$ as in
>> A(:,1)=[-0.01;0.67;-0.44]


A =
-0.0100 0.5200 1.0000
0.6700 1.0000 1.9000
-0.4400 0.3000 0.5000


>> x1=det(A)/D
x1 =


-14.9000


For more than three equations, Cramer’s rule becomes impractical because, as the 
number of equations increases, the determinants are time consuming to evaluate by hand 
(or by computer). Consequently, more efficient alternatives are used. Some of these alter- 
natives are based on the last noncomputer solution technique covered in Sec. 9.1.3—the 
elimination of unknowns. 
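
The column-replacement step of Cramer’s rule can be repeated for every unknown with a loop. The following is a minimal sketch using the system from Example 9.2. The loop structure and the names Ai and x are our own.

```matlab
A = [0.3 0.52 1; 0.5 1 1.9; 0.1 0.3 0.5];
b = [-0.01; 0.67; -0.44];

D = det(A);
n = length(b);
x = zeros(n,1);
for i = 1:n
    Ai = A;            % start from the original coefficients
    Ai(:,i) = b;       % replace column i with the right-hand-side constants
    x(i) = det(Ai)/D;  % Cramer's rule for unknown i
end
```

Each pass requires a fresh determinant, so n + 1 determinants are needed for n unknowns, which illustrates why the approach scales poorly.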


9.1.3 Elimination of Unknowns 


The elimination of unknowns by combining equations is an algebraic approach that can be 
illustrated for a set of two equations: 

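$$a_{11}x_1 + a_{12}x_2 = b_1 \tag{9.2}$$

$$a_{21}x_1 + a_{22}x_2 = b_2 \tag{9.3}$$

The basic strategy is to multiply the equations by constants so that one of the unknowns
will be eliminated when the two equations are combined. The result is a single equation
that can be solved for the remaining unknown. This value can then be substituted into
either of the original equations to compute the other variable.

For example, Eq. (9.2) might be multiplied by $a_{21}$ and Eq. (9.3) by $a_{11}$ to give

$$a_{21}a_{11}x_1 + a_{21}a_{12}x_2 = a_{21}b_1 \tag{9.4}$$

$$a_{11}a_{21}x_1 + a_{11}a_{22}x_2 = a_{11}b_2 \tag{9.5}$$

Subtracting Eq. (9.4) from Eq. (9.5) will, therefore, eliminate the $x_1$ term from the equations
to yield

$$a_{11}a_{22}x_2 - a_{21}a_{12}x_2 = a_{11}b_2 - a_{21}b_1$$

which can be solved for

$$x_2 = \frac{a_{11}b_2 - a_{21}b_1}{a_{11}a_{22} - a_{21}a_{12}} \tag{9.6}$$

Equation (9.6) can then be substituted into Eq. (9.2), which can be solved for

$$x_1 = \frac{a_{22}b_1 - a_{12}b_2}{a_{11}a_{22} - a_{21}a_{12}} \tag{9.7}$$

Notice that Eqs. (9.6) and (9.7) follow directly from Cramer’s rule:

$$x_1 = \frac{\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}} = \frac{b_1a_{22} - a_{12}b_2}{a_{11}a_{22} - a_{21}a_{12}}$$

$$x_2 = \frac{\begin{vmatrix} a_{11} & b_1 \\ a_{21} & b_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}} = \frac{a_{11}b_2 - b_1a_{21}}{a_{11}a_{22} - a_{21}a_{12}}$$
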

The elimination of unknowns can be extended to systems with more than two or three
equations. However, the numerous calculations that are required for larger systems make
the method extremely tedious to implement by hand. Fortunately, as described in Sec. 9.2, the
technique can be formalized and readily programmed for the computer.
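
For a pair of equations, Eqs. (9.6) and (9.7) can be evaluated directly. The following minimal sketch applies them to the two equations from Sec. 9.1.1 (3x1 + 2x2 = 18 and -x1 + 2x2 = 2). The variable names are illustrative.

```matlab
% Coefficients and constants of 3*x1 + 2*x2 = 18 and -x1 + 2*x2 = 2
a11 = 3;  a12 = 2;  b1 = 18;
a21 = -1; a22 = 2;  b2 = 2;

% Common denominator of Eqs. (9.6) and (9.7)
den = a11*a22 - a21*a12;

x2 = (a11*b2 - a21*b1)/den;   % Eq. (9.6)
x1 = (a22*b1 - a12*b2)/den;   % Eq. (9.7)
```

The result agrees with the graphical solution of Sec. 9.1.1. Note that den is the determinant of the coefficient matrix, so the formulas fail for a singular system.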


9.2 NAIVE GAUSS ELIMINATION


In Sec. 9.1.3, the elimination of unknowns was used to solve a pair of simultaneous equa- 
tions. The procedure consisted of two steps (Fig. 9.3): 


1. The equations were manipulated to eliminate one of the unknowns from the equations. 
The result of this elimination step was that we had one equation with one unknown. 

2. Consequently, this equation could be solved directly and the result back-substituted 
into one of the original equations to solve for the remaining unknown. 


This basic approach can be extended to large sets of equations by developing a system- 
atic scheme or algorithm to eliminate unknowns and to back-substitute. Gauss elimination 
is the most basic of these schemes. 

This section includes the systematic techniques for forward elimination and back 
substitution that comprise Gauss elimination. Although these techniques are ideally 
suited for implementation on computers, some modifications will be required to obtain 
a reliable algorithm. In particular, the computer program must avoid division by zero. 
The following method is called “naive” Gauss elimination because it does not avoid this 
problem. Section 9.3 will deal with the additional features required for an effective com- 
puter program. 

The approach is designed to solve a general set of n equations: 


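$$a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1 \tag{9.8a}$$

$$a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n = b_2 \tag{9.8b}$$

$$\vdots$$

$$a_{n1}x_1 + a_{n2}x_2 + a_{n3}x_3 + \cdots + a_{nn}x_n = b_n \tag{9.8c}$$


FIGURE 9.3
The two phases of Gauss elimination: (a) forward elimination and (b) back substitution.

(a) Forward elimination:

$$\left[\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ a_{31} & a_{32} & a_{33} & b_3 \end{array}\right] \Rightarrow \left[\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & b_1 \\ & a'_{22} & a'_{23} & b'_2 \\ & & a''_{33} & b''_3 \end{array}\right]$$

(b) Back substitution:

$$x_3 = b''_3/a''_{33}, \qquad x_2 = (b'_2 - a'_{23}x_3)/a'_{22}, \qquad x_1 = (b_1 - a_{13}x_3 - a_{12}x_2)/a_{11}$$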


As was the case with the solution of two equations, the technique for n equations consists 
of two phases: elimination of unknowns and solution through back substitution. 


Forward Elimination of Unknowns. The first phase is designed to reduce the set of
equations to an upper triangular system (Fig. 9.3a). The initial step will be to eliminate the first
unknown $x_1$ from the second through the nth equations. To do this, multiply Eq. (9.8a) by
$a_{21}/a_{11}$ to give

$$a_{21}x_1 + \frac{a_{21}}{a_{11}}a_{12}x_2 + \frac{a_{21}}{a_{11}}a_{13}x_3 + \cdots + \frac{a_{21}}{a_{11}}a_{1n}x_n = \frac{a_{21}}{a_{11}}b_1 \tag{9.9}$$

This equation can be subtracted from Eq. (9.8b) to give

$$\left(a_{22} - \frac{a_{21}}{a_{11}}a_{12}\right)x_2 + \cdots + \left(a_{2n} - \frac{a_{21}}{a_{11}}a_{1n}\right)x_n = b_2 - \frac{a_{21}}{a_{11}}b_1$$

or

$$a'_{22}x_2 + \cdots + a'_{2n}x_n = b'_2$$

where the prime indicates that the elements have been changed from their original values. 

The procedure is then repeated for the remaining equations. For instance, Eq. (9.8a)
can be multiplied by $a_{31}/a_{11}$ and the result subtracted from the third equation. Repeating
the procedure for the remaining equations results in the following modified system:


$$a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1 \tag{9.10a}$$
$$a'_{22}x_2 + a'_{23}x_3 + \cdots + a'_{2n}x_n = b'_2 \tag{9.10b}$$
$$a'_{32}x_2 + a'_{33}x_3 + \cdots + a'_{3n}x_n = b'_3 \tag{9.10c}$$
$$\vdots$$
$$a'_{n2}x_2 + a'_{n3}x_3 + \cdots + a'_{nn}x_n = b'_n \tag{9.10d}$$


For the foregoing steps, Eq. (9.8a) is called the pivot equation and $a_{11}$ is called the
pivot element. Note that the process of multiplying the first row by $a_{21}/a_{11}$ is equivalent to
dividing it by $a_{11}$ and multiplying it by $a_{21}$. Sometimes the division operation is referred to
as normalization. We make this distinction because a zero pivot element can interfere with 
normalization by causing a division by zero. We will return to this important issue after we 
complete our description of naive Gauss elimination. 

The next step is to eliminate $x_2$ from Eqs. (9.10c) through (9.10d). To do this, multiply
Eq. (9.10b) by $a'_{32}/a'_{22}$ and subtract the result from Eq. (9.10c). Perform a similar
elimination for the remaining equations to yield


$$a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1$$
$$a'_{22}x_2 + a'_{23}x_3 + \cdots + a'_{2n}x_n = b'_2$$
$$a''_{33}x_3 + \cdots + a''_{3n}x_n = b''_3$$
$$\vdots$$
$$a''_{n3}x_3 + \cdots + a''_{nn}x_n = b''_n$$


where the double prime indicates that the elements have been modified twice. 

The procedure can be continued using the remaining pivot equations. The final 
manipulation in the sequence is to use the (n - 1)th equation to eliminate the $x_{n-1}$ term
from the nth equation. At this point, the system will have been transformed to an upper
triangular system:


$$a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1 \tag{9.11a}$$
$$a'_{22}x_2 + a'_{23}x_3 + \cdots + a'_{2n}x_n = b'_2 \tag{9.11b}$$
$$a''_{33}x_3 + \cdots + a''_{3n}x_n = b''_3 \tag{9.11c}$$
$$\vdots$$
$$a_{nn}^{(n-1)}x_n = b_n^{(n-1)} \tag{9.11d}$$


Back Substitution. Equation (9.11d) can now be solved for $x_n$:


$$x_n = \frac{b_n^{(n-1)}}{a_{nn}^{(n-1)}} \tag{9.12}$$


This result can be back-substituted into the (n - 1)th equation to solve for $x_{n-1}$. The
procedure, which is repeated to evaluate the remaining x’s, can be represented by the
following formula:


$$x_i = \frac{b_i^{(i-1)} - \sum_{j=i+1}^{n} a_{ij}^{(i-1)}x_j}{a_{ii}^{(i-1)}} \qquad \text{for } i = n-1, n-2, \ldots, 1 \tag{9.13}$$
EXAMPLE 9.3 Naive Gauss Elimination


Problem Statement. Use Gauss elimination to solve


$$3x_1 - 0.1x_2 - 0.2x_3 = 7.85 \tag{E9.3.1}$$
$$0.1x_1 + 7x_2 - 0.3x_3 = -19.3 \tag{E9.3.2}$$
$$0.3x_1 - 0.2x_2 + 10x_3 = 71.4 \tag{E9.3.3}$$


Solution. The first part of the procedure is forward elimination. Multiply Eq. (E9.3.1) 
by 0.1/3 and subtract the result from Eq. (E9.3.2) to give 


$$7.00333x_2 - 0.293333x_3 = -19.5617$$


Then multiply Eq. (E9.3.1) by 0.3/3 and subtract it from Eq. (E9.3.3). After these operations,
the set of equations is


$$3x_1 - 0.1x_2 - 0.2x_3 = 7.85 \tag{E9.3.4}$$
$$7.00333x_2 - 0.293333x_3 = -19.5617 \tag{E9.3.5}$$
$$-0.190000x_2 + 10.0200x_3 = 70.6150 \tag{E9.3.6}$$


To complete the forward elimination, $x_2$ must be removed from Eq. (E9.3.6). To accomplish
this, multiply Eq. (E9.3.5) by -0.190000/7.00333 and subtract the result from
Eq. (E9.3.6). This eliminates $x_2$ from the third equation and reduces the system to an upper
triangular form, as in


$$3x_1 - 0.1x_2 - 0.2x_3 = 7.85 \tag{E9.3.7}$$
$$7.00333x_2 - 0.293333x_3 = -19.5617 \tag{E9.3.8}$$
$$10.0120x_3 = 70.0843 \tag{E9.3.9}$$


We can now solve these equations by back substitution. First, Eq. (E9.3.9) can be
solved for


$$x_3 = \frac{70.0843}{10.0120} = 7.00003$$

This result can be back-substituted into Eq. (E9.3.8), which can then be solved for

$$x_2 = \frac{-19.5617 + 0.293333(7.00003)}{7.00333} = -2.50000$$

Finally, $x_3 = 7.00003$ and $x_2 = -2.50000$ can be substituted back into Eq. (E9.3.7), which
can be solved for

$$x_1 = \frac{7.85 + 0.1(-2.50000) + 0.2(7.00003)}{3} = 3.00000$$

Although there is a slight round-off error, the results are very close to the exact solution
of $x_1 = 3$, $x_2 = -2.5$, and $x_3 = 7$. This can be verified by substituting the results into the
original equation set:


$$3(3) - 0.1(-2.5) - 0.2(7.00003) = 7.84999 \cong 7.85$$
$$0.1(3) + 7(-2.5) - 0.3(7.00003) = -19.30000 \cong -19.3$$
$$0.3(3) - 0.2(-2.5) + 10(7.00003) = 71.4003 \cong 71.4$$
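
The hand calculation of Example 9.3 can be mirrored in a few lines of MATLAB. The following is a minimal sketch rather than part of the original example; the variable names Aug, x1, x2, x3, and x are chosen here for illustration. It applies the same three row operations to the augmented matrix and then back-substitutes. Because it works in double precision rather than with six significant figures, the result should agree with the exact solution to within round-off.

```matlab
% Example 9.3 carried out as explicit row operations (illustrative sketch)
A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];
b = [7.85; -19.3; 71.4];
Aug = [A b];                                  % augmented matrix

% forward elimination: remove x1 from rows 2 and 3
Aug(2,:) = Aug(2,:) - Aug(2,1)/Aug(1,1)*Aug(1,:);
Aug(3,:) = Aug(3,:) - Aug(3,1)/Aug(1,1)*Aug(1,:);
% forward elimination: remove x2 from row 3
Aug(3,:) = Aug(3,:) - Aug(3,2)/Aug(2,2)*Aug(2,:);

% back substitution on the upper triangular system
x3 = Aug(3,4)/Aug(3,3);
x2 = (Aug(2,4) - Aug(2,3)*x3)/Aug(2,2);
x1 = (Aug(1,4) - Aug(1,2)*x2 - Aug(1,3)*x3)/Aug(1,1);
x = [x1; x2; x3]
```

Displaying Aug after each step is a convenient way to compare the intermediate coefficients with Eqs. (E9.3.4) through (E9.3.9).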
function x = GaussNaive(A,b)
% GaussNaive: naive Gauss elimination
%   x = GaussNaive(A,b): Gauss elimination without pivoting.
% input:
%   A = coefficient matrix
%   b = right hand side vector
% output:
%   x = solution vector

[m,n] = size(A);
if m~=n, error('Matrix A must be square'); end
nb = n+1;
Aug = [A b];
% forward elimination
for k = 1:n-1
  for i = k+1:n
    factor = Aug(i,k)/Aug(k,k);
    Aug(i,k:nb) = Aug(i,k:nb)-factor*Aug(k,k:nb);
  end
end
% back substitution
x = zeros(n,1);
x(n) = Aug(n,nb)/Aug(n,n);
for i = n-1:-1:1
  x(i) = (Aug(i,nb)-Aug(i,i+1:n)*x(i+1:n))/Aug(i,i);
end


FIGURE 9.4 
An M-file to implement naive Gauss elimination. 


9.2.1 MATLAB M-file: GaussNaive


An M-file that implements naive Gauss elimination is listed in Fig. 9.4. Notice that the 
coefficient matrix A and the right-hand-side vector b are combined in the augmented matrix 
Aug. Thus, the operations are performed on Aug rather than separately on A and b. 

Two nested loops provide a concise representation of the forward elimination step. 
An outer loop moves down the matrix from one pivot row to the next. The inner loop 
moves below the pivot row to each of the subsequent rows where elimination is to take 
place. Finally, the actual elimination is represented by a single line that takes advantage of 
MATLAB’s ability to perform matrix operations. 

The back-substitution step follows directly from Eqs. (9.12) and (9.13). Again, 
MATLAB’s ability to perform matrix operations allows Eq. (9.13) to be programmed as a 
single line. 


9.2.2 Operation Counting 


The execution time of Gauss elimination depends on the number of floating-point operations
(or flops) involved in the algorithm. On modern computers using math coprocessors, the time
consumed to perform addition/subtraction and multiplication/division is about the same.
Therefore, totaling up these operations provides insight into which parts of the algorithm are
most time consuming and how computation time increases as the system gets larger.

Before analyzing naive Gauss elimination, we will first define some quantities that 
facilitate operation counting: 


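$$\sum_{i=1}^{m} c\,f(i) = c\sum_{i=1}^{m} f(i) \qquad \sum_{i=1}^{m}\left[f(i) + g(i)\right] = \sum_{i=1}^{m} f(i) + \sum_{i=1}^{m} g(i) \tag{9.14a,b}$$

$$\sum_{i=1}^{m} 1 = 1 + 1 + 1 + \cdots + 1 = m \qquad \sum_{i=k}^{m} 1 = m - k + 1 \tag{9.14c,d}$$

$$\sum_{i=1}^{m} i = 1 + 2 + 3 + \cdots + m = \frac{m(m+1)}{2} = \frac{m^2}{2} + O(m) \tag{9.14e}$$

$$\sum_{i=1}^{m} i^2 = 1^2 + 2^2 + 3^2 + \cdots + m^2 = \frac{m(m+1)(2m+1)}{6} = \frac{m^3}{3} + O(m^2) \tag{9.14f}$$

where $O(m^n)$ means “terms of order $m^n$ and lower.”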

Now let us examine the naive Gauss elimination algorithm (Fig. 9.4) in detail. We will 
first count the flops in the elimination stage. On the first pass through the outer loop, k = 1. 
Therefore, the limits on the inner loop are from i = 2 to n. According to Eq. (9.14d), this 
means that the number of iterations of the inner loop will be 


$$\sum_{i=2}^{n} 1 = n - 2 + 1 = n - 1 \tag{9.15}$$

For every one of these iterations, there is one division to calculate the factor. The 
next line then performs a multiplication and a subtraction for each column element from 2 
to nb. Because nb = n + 1, going from 2 to nb results in n multiplications and n subtrac- 
tions. Together with the single division, this amounts to n + 1 multiplications/divisions and 
n addition/subtractions for every iteration of the inner loop. The total for the first pass 
through the outer loop is therefore (n — 1)(n + 1) multiplication/divisions and (n — 1)(n) 
addition/subtractions. 

Similar reasoning can be used to estimate the flops for the subsequent iterations of the 
outer loop. These can be summarized as 


Outer Loop    Inner Loop    Addition/Subtraction     Multiplication/Division
    k             i                Flops                      Flops
    1            2, n            (n - 1)(n)              (n - 1)(n + 1)
    2            3, n          (n - 2)(n - 1)              (n - 2)(n)
    .             .                  .                          .
    k          k + 1, n      (n - k)(n + 1 - k)        (n - k)(n + 2 - k)
    .             .                  .                          .
  n - 1          n, n              (1)(2)                     (1)(3)


Therefore, the total addition/subtraction flops for elimination can be computed as

$$\sum_{k=1}^{n-1}(n-k)(n+1-k) = \sum_{k=1}^{n-1}\left[n(n+1) - k(2n+1) + k^2\right] \tag{9.16}$$

or

$$n(n+1)\sum_{k=1}^{n-1} 1 - (2n+1)\sum_{k=1}^{n-1} k + \sum_{k=1}^{n-1} k^2 \tag{9.17}$$

Applying some of the relationships from Eq. (9.14) yields

$$\left[n^3 + O(n)\right] - \left[n^3 + O(n^2)\right] + \left[\frac{1}{3}n^3 + O(n^2)\right] = \frac{n^3}{3} + O(n) \tag{9.18}$$


A similar analysis for the multiplication/division flops yields


$$\left[n^3 + O(n^2)\right] - \left[n^3 + O(n)\right] + \left[\frac{1}{3}n^3 + O(n^2)\right] = \frac{n^3}{3} + O(n^2) \tag{9.19}$$

Summing these results gives

$$\frac{2n^3}{3} + O(n^2) \tag{9.20}$$

Thus, the total number of flops is equal to $2n^3/3$ plus an additional component proportional
to terms of order $n^2$ and lower. The result is written in this way because as n gets
large, the $O(n^2)$ and lower terms become negligible. We are therefore justified in concluding
that for large n, the effort involved in forward elimination converges on $2n^3/3$.

Because only a single loop is used, back substitution is much simpler to evaluate. The 
number of addition/subtraction flops is equal to n(n — 1)/2. Because of the extra division 
prior to the loop, the number of multiplication/division flops is n(n + 1)/2. These can be 
added to arrive at a total of 


$$n^2 + O(n) \tag{9.21}$$


Thus, the total effort in naive Gauss elimination can be represented as


$$\underbrace{\frac{2n^3}{3} + O(n^2)}_{\text{Forward elimination}} + \underbrace{n^2 + O(n)}_{\text{Back substitution}} \;\xrightarrow{\text{as } n \text{ increases}}\; \frac{2n^3}{3} + O(n^2) \tag{9.22}$$


Two useful general conclusions can be drawn from this analysis: 


1. As the system gets larger, the computation time increases greatly. As in Table 9.1, the
number of flops increases nearly three orders of magnitude for every order of magnitude
increase in the number of equations.

2. Most of the effort is incurred in the elimination step. Thus, efforts to make the method
more efficient should probably focus on this step.


TABLE 9.1 Number of flops for naive Gauss elimination.

                            Back           Total                        Percent Due
   n      Elimination    Substitution      Flops          2n^3/3       to Elimination
   10         705            100            805             667            87.58
  100       671550          10000          681550          666667          98.53
 1000     6.67 x 10^8      1 x 10^6      6.68 x 10^8     6.67 x 10^8       99.85
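
The flop totals can also be tallied numerically instead of with the summation formulas. The script below is a sketch under the same counting convention used above: each inner-loop pass of the elimination costs one division plus n + 1 - k multiplications and n + 1 - k subtractions, and each back-substitution row costs n - i multiplications, n - i subtractions, and one division. The variable names (nvals, elim, back, flops) are illustrative only.

```matlab
% Tally flops for naive Gauss elimination using the chapter's counting convention
nvals = [10 100 1000];
flops = zeros(numel(nvals), 4);        % columns: n, elimination, back subst., total
for c = 1:numel(nvals)
  n = nvals(c);
  elim = 0;
  for k = 1:n-1
    % (n-k) rows lie below the pivot row; each costs 1 division
    % plus (n+1-k) multiplications and (n+1-k) subtractions
    elim = elim + (n-k)*(1 + 2*(n+1-k));
  end
  back = 1;                            % the single division that gives x(n)
  for i = n-1:-1:1
    % (n-i) multiplications, (n-i) subtractions, 1 division
    back = back + 2*(n-i) + 1;
  end
  flops(c,:) = [n elim back elim+back];
end
disp(flops)
pct = 100*flops(:,2)./flops(:,4)       % percent of effort due to elimination
```

The rows of flops correspond to the Elimination, Back Substitution, and Total Flops columns of Table 9.1.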


9.3 PIVOTING


The primary reason that the foregoing technique is called “naive” is that during both the 
elimination and the back-substitution phases, it is possible that a division by zero can 
occur. For example, if we use naive Gauss elimination to solve 

$$2x_2 + 3x_3 = 8$$
$$4x_1 + 6x_2 + 7x_3 = -3$$
$$2x_1 - 3x_2 + 6x_3 = 5$$

the normalization of the first row would involve division by $a_{11} = 0$. Problems may also
arise when the pivot element is close, rather than exactly equal, to zero because if the
magnitude of the pivot element is small compared to the other elements, then round-off errors
can be introduced.
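
To see what a zero pivot does in practice, consider the first elimination step applied to the system above. The snippet is a small illustrative sketch (the names Aug, factor, and row2 are assumptions made here); it repeats the operation that the naive algorithm would perform on the second row. In IEEE floating-point arithmetic MATLAB does not stop with an error but returns Inf and NaN values, which then contaminate every later step.

```matlab
% First elimination step of naive Gauss elimination with a zero pivot
Aug = [0  2 3  8;
       4  6 7 -3;
       2 -3 6  5];                       % augmented matrix, a11 = 0
factor = Aug(2,1)/Aug(1,1)               % division by the zero pivot
row2 = Aug(2,:) - factor*Aug(1,:)        % the row that elimination would store
```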

Therefore, before each row is normalized, it is advantageous to determine the coefficient 
with the largest absolute value in the column below the pivot element. The rows can then be 
switched so that the largest element is the pivot element. This is called partial pivoting. 

If columns as well as rows are searched for the largest element and then switched, the 
procedure is called complete pivoting. Complete pivoting is rarely used because most of 
the improvement comes from partial pivoting. In addition, switching columns changes the 
order of the x’s and, consequently, adds significant and usually unjustified complexity to 
the computer program. 

The following example illustrates the advantages of partial pivoting. Aside from avoiding
division by zero, pivoting also reduces round-off error. As such, it also serves as a
partial remedy for ill-conditioning.


EXAMPLE 9.4 Partial Pivoting


Problem Statement. Use Gauss elimination to solve

$$0.0003x_1 + 3.0000x_2 = 2.0001$$
$$1.0000x_1 + 1.0000x_2 = 1.0000$$


Note that in this form the first pivot element, $a_{11} = 0.0003$, is very close to zero. Then
repeat the computation, but partial pivot by reversing the order of the equations. The exact
solution is $x_1 = 1/3$ and $x_2 = 2/3$.


Solution. Multiplying the first equation by 1/(0.0003) yields

$$x_1 + 10{,}000x_2 = 6667$$

which can be used to eliminate $x_1$ from the second equation:

$$-9999x_2 = -6666$$


which can be solved for $x_2 = 2/3$. This result can be substituted back into the first equation
to evaluate $x_1$:

$$x_1 = \frac{2.0001 - 3(2/3)}{0.0003} \tag{E9.4.1}$$

Due to subtractive cancellation, the result is very sensitive to the number of significant
figures carried in the computation:


                                            Absolute Value of
Significant                                 Percent Relative
  Figures         x2            x1            Error for x1
     3          0.667         -3.33               1099
     4          0.6667         0.0000              100
     5          0.66667        0.30000              10
     6          0.666667       0.330000              1
     7          0.6666667      0.3330000             0.1


Note how the solution for $x_1$ is highly dependent on the number of significant figures. This
is because in Eq. (E9.4.1), we are subtracting two almost-equal numbers.

On the other hand, if the equations are solved in reverse order, the row with the larger 
pivot element is normalized. The equations are 


$$1.0000x_1 + 1.0000x_2 = 1.0000$$
$$0.0003x_1 + 3.0000x_2 = 2.0001$$


Elimination and substitution again yields $x_2 = 2/3$. For different numbers of significant
figures, $x_1$ can be computed from the first equation, as in


$$x_1 = \frac{1 - (2/3)}{1}$$


This case is much less sensitive to the number of significant figures in the computation:


                                            Absolute Value of
Significant                                 Percent Relative
  Figures         x2            x1            Error for x1
     3          0.667          0.333               0.1
     4          0.6667         0.3333              0.01
     5          0.66667        0.33333             0.001
     6          0.666667       0.333333            0.0001
     7          0.6666667      0.3333333           0.0000


Thus, a pivot strategy is much more satisfactory. 
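
The sensitivity to significant figures can be explored with a short script. This is a sketch under a simplifying assumption: only $x_2$ is rounded to s significant figures, while the remaining arithmetic is done in double precision. For that reason the three-figure result without pivoting comes out near 1000% rather than the tabulated 1099%, which presumably also reflects rounding of intermediate values. The names sf, x1a, x1b, err_nopivot, and err_pivot are illustrative.

```matlab
% Effect of rounding x2 on x1, without and with partial pivoting (sketch)
xtrue = 1/3;
sf = 3:7;                               % significant figures carried for x2
err_nopivot = zeros(size(sf));
err_pivot = zeros(size(sf));
for c = 1:numel(sf)
  s = sf(c);
  x2 = round((2/3)*10^s)/10^s;          % 2/3 rounded to s significant figures
  x1a = (2.0001 - 3*x2)/0.0003;         % Eq. (E9.4.1): small pivot, no pivoting
  x1b = (1 - x2)/1;                     % first equation after the rows are swapped
  err_nopivot(c) = abs((x1a - xtrue)/xtrue)*100;
  err_pivot(c) = abs((x1b - xtrue)/xtrue)*100;
end
% columns: significant figures, percent error without pivoting, with pivoting
disp([sf' err_nopivot' err_pivot'])
```

Dividing by the small pivot 0.0003 amplifies the rounding error in $x_2$ by a factor of roughly 10^4, which is why the two error columns differ by about four orders of magnitude.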


9.3.1 MATLAB M-file: GaussPivot 


An M-file that implements Gauss elimination with partial pivoting is listed in Fig. 9.5. It
is identical to the M-file for naive Gauss elimination presented previously in Sec. 9.2.1
with the exception of the portion, introduced by the comment % partial pivoting, that
implements partial pivoting.

Notice how the built-in MATLAB function max is used to determine the largest avail- 
able coefficient in the column below the pivot element. The max function has the syntax 


[y, i] = max (x) 
where y is the largest element in the vector x, and i is the index corresponding to that 
element. 
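
As a brief illustration of how max locates the pivot row, the following sketch applies the search to the first column of the zero-pivot system introduced at the start of this section. The variable names follow those used in Fig. 9.5; the matrix values are simply that example's coefficients. Note that max returns an index relative to the subvector Aug(k:n,k), so k - 1 must be added to obtain the row number in the full matrix.

```matlab
% Locating and swapping in the pivot row for column k (sketch)
Aug = [0  2 3  8;
       4  6 7 -3;
       2 -3 6  5];
n = 3;  k = 1;
[big, i] = max(abs(Aug(k:n,k)));   % largest magnitude on or below the diagonal
ipr = i + k - 1;                   % convert the subvector index to a row number
if ipr ~= k
  Aug([k,ipr],:) = Aug([ipr,k],:); % swap the pivot row into position k
end
Aug
```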
function x = GaussPivot(A,b)
% GaussPivot: Gauss elimination pivoting
%   x = GaussPivot(A,b): Gauss elimination with pivoting.
% input:
%   A = coefficient matrix
%   b = right hand side vector
% output:
%   x = solution vector

[m,n]=size(A);
if m~=n, error('Matrix A must be square'); end
nb=n+1;
Aug=[A b];
% forward elimination
for k = 1:n-1
  % partial pivoting
  [big,i] = max(abs(Aug(k:n,k)));
  ipr=i+k-1;
  if ipr~=k
    Aug([k,ipr],:) = Aug([ipr,k],:);
  end
  for i = k+1:n
    factor = Aug(i,k)/Aug(k,k);
    Aug(i,k:nb)=Aug(i,k:nb)-factor*Aug(k,k:nb);
  end
end
% back substitution
x=zeros(n,1);
x(n)=Aug(n,nb)/Aug(n,n);
for i = n-1:-1:1
  x(i)=(Aug(i,nb)-Aug(i,i+1:n)*x(i+1:n))/Aug(i,i);
end


FIGURE 9.5 
An M-file to implement Gauss elimination with partial pivoting. 


9.3.2 Determinant Evaluation with Gauss Elimination 


At the end of Sec. 9.1.2, we suggested that determinant evaluation by expansion of minors 
was impractical for large sets of equations. However, because the determinant has value in 
assessing system condition, it would be useful to have a practical method for computing 
this quantity. 

Fortunately, Gauss elimination provides a simple way to do this. The method is based 
on the fact that the determinant of a triangular matrix can be simply computed as the product 
of its diagonal elements: 


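$$D = a_{11}a_{22}a_{33}\cdots a_{nn}$$


The validity of this formulation can be illustrated for a 3 × 3 system:


$$D = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{vmatrix}$$

where the determinant can be evaluated as [recall Eq. (9.1)]:

$$D = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ 0 & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} 0 & a_{23} \\ 0 & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} 0 & a_{22} \\ 0 & 0 \end{vmatrix}$$


or, by evaluating the minors:

$$D = a_{11}a_{22}a_{33} - a_{12}(0) + a_{13}(0) = a_{11}a_{22}a_{33}$$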


Recall that the forward-elimination step of Gauss elimination results in an upper
triangular system. Because the value of the determinant is not changed by the forward-
elimination process, the determinant can be simply evaluated at the end of this step via


$$D = a_{11}a'_{22}a''_{33}\cdots a_{nn}^{(n-1)}$$

where the superscripts signify the number of times that the elements have been modified 
by the elimination process. Thus, we can capitalize on the effort that has already been 
expended in reducing the system to triangular form and, in the bargain, come up with a 
simple estimate of the determinant. 

There is a slight modification to the above approach when the program employs partial
pivoting. For such cases, the determinant changes sign every time a row is switched. One
way to represent this is by modifying the determinant calculation as in


$$D = a_{11}a'_{22}a''_{33}\cdots a_{nn}^{(n-1)}(-1)^p$$


where p represents the number of times that rows are pivoted. This modification can be 
incorporated simply into a program by merely keeping track of the number of pivots that 
take place during the course of the computation. 
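
The bookkeeping described above might be sketched as follows. This is an illustration, not the listing of Fig. 9.5: it reuses that figure's pivoting logic on the coefficient matrix alone, counts row switches in a variable p (a name assumed here), and forms the determinant from the diagonal of the triangularized matrix. The test matrix is the zero-pivot example from the start of Sec. 9.3, and MATLAB's built-in det is called only as a cross-check.

```matlab
% Determinant from forward elimination with partial pivoting (sketch)
A0 = [0 2 3; 4 6 7; 2 -3 6];
A = A0;
n = size(A,1);
p = 0;                                  % number of row switches
for k = 1:n-1
  [big, i] = max(abs(A(k:n,k)));
  ipr = i + k - 1;
  if ipr ~= k
    A([k,ipr],:) = A([ipr,k],:);
    p = p + 1;                          % each switch flips the sign of D
  end
  for i = k+1:n
    factor = A(i,k)/A(k,k);
    A(i,k:n) = A(i,k:n) - factor*A(k,k:n);
  end
end
D = prod(diag(A))*(-1)^p                % product of pivots with sign correction
Dcheck = det(A0)                        % built-in function, for comparison
```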


9.4 TRIDIAGONAL SYSTEMS


Certain matrices have a particular structure that can be exploited to develop efficient solu- 
tion schemes. For example, a banded matrix is a square matrix that has all elements equal 
to zero, with the exception of a band centered on the main diagonal. 

A tridiagonal system has a bandwidth of 3 and can be expressed generally as 


$$\begin{bmatrix} f_1 & g_1 & & & & \\ e_2 & f_2 & g_2 & & & \\ & e_3 & f_3 & g_3 & & \\ & & \ddots & \ddots & \ddots & \\ & & & e_{n-1} & f_{n-1} & g_{n-1} \\ & & & & e_n & f_n \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_{n-1} \\ x_n \end{Bmatrix} = \begin{Bmatrix} r_1 \\ r_2 \\ r_3 \\ \vdots \\ r_{n-1} \\ r_n \end{Bmatrix} \tag{9.23}$$

Notice that we have changed our notation for the coefficients from a’s and b’s to e’s,
f’s, g’s, and r’s. This was done to avoid storing large numbers of useless zeros in the square
matrix of a’s. This space-saving modification is advantageous because the resulting algorithm
requires less computer memory.
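
The relationship between the three coefficient vectors and the full matrix can be made concrete with a short sketch. It uses the coefficients of the 4 × 4 system treated in Example 9.5 and the case study, and it assumes the storage convention used there, in which e(1) and g(n) are unused placeholders (set to zero) so that all vectors have length n. The built-in diag function assembles the full matrix purely for inspection; a tridiagonal solver never needs to form it.

```matlab
% Compact tridiagonal storage versus the full matrix (illustrative sketch)
e = [0 -1 -1 -1];                 % subdiagonal   (e(1) is a placeholder)
f = [2.04 2.04 2.04 2.04];        % main diagonal
g = [-1 -1 -1 0];                 % superdiagonal (g(n) is a placeholder)
n = numel(f);
A = diag(f) + diag(g(1:end-1), 1) + diag(e(2:end), -1);
disp(A)
fprintf('full storage: %d values, compact storage: %d values\n', n^2, 3*n);
```

For n = 4 the saving is small, but the full matrix grows as n^2 whereas the three vectors grow only as 3n.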

An algorithm to solve such systems can be directly patterned after Gauss elimination— 
that is, using forward elimination and back substitution. However, because most of the 
matrix elements are already zero, much less effort is expended than for a full matrix. This 
efficiency is illustrated in the following example. 


EXAMPLE 9.5 Solution of a Tridiagonal System 


Problem Statement. Solve the following tridiagonal system: 


$$\begin{bmatrix} 2.04 & -1 & & \\ -1 & 2.04 & -1 & \\ & -1 & 2.04 & -1 \\ & & -1 & 2.04 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{Bmatrix} = \begin{Bmatrix} 40.8 \\ 0.8 \\ 0.8 \\ 200.8 \end{Bmatrix}$$


Solution. As with Gauss elimination, the first step involves transforming the matrix to
upper triangular form. This is done by multiplying the first equation by the factor $e_2/f_1$
and subtracting the result from the second equation. This creates a zero in place of $e_2$ and
transforms the other coefficients to new values,


$$f_2 = f_2 - \frac{e_2}{f_1}g_1 = 2.04 - \frac{-1}{2.04}(-1) = 1.550$$
$$r_2 = r_2 - \frac{e_2}{f_1}r_1 = 0.8 - \frac{-1}{2.04}(40.8) = 20.8$$

Notice that $g_2$ is unmodified because the element above it in the first row is zero.
After performing a similar calculation for the third and fourth rows, the system is
transformed to the upper triangular form


$$\begin{bmatrix} 2.04 & -1 & & \\ & 1.550 & -1 & \\ & & 1.395 & -1 \\ & & & 1.323 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{Bmatrix} = \begin{Bmatrix} 40.8 \\ 20.8 \\ 14.221 \\ 210.996 \end{Bmatrix}$$


Now back substitution can be applied to generate the final solution:


$$x_4 = \frac{r_4}{f_4} = \frac{210.996}{1.323} = 159.480$$
$$x_3 = \frac{r_3 - g_3x_4}{f_3} = \frac{14.221 - (-1)159.480}{1.395} = 124.538$$
$$x_2 = \frac{r_2 - g_2x_3}{f_2} = \frac{20.800 - (-1)124.538}{1.550} = 93.778$$
$$x_1 = \frac{r_1 - g_1x_2}{f_1} = \frac{40.800 - (-1)93.778}{2.040} = 65.970$$
function x = Tridiag(e,f,g,r)
% Tridiag: Tridiagonal equation solver banded system
%   x = Tridiag(e,f,g,r): Tridiagonal system solver.
% input:
%   e = subdiagonal vector
%   f = diagonal vector
%   g = superdiagonal vector
%   r = right hand side vector
% output:
%   x = solution vector
n=length(f);
% forward elimination
for k = 2:n
  factor = e(k)/f(k-1);
  f(k) = f(k) - factor*g(k-1);
  r(k) = r(k) - factor*r(k-1);
end
% back substitution
x(n) = r(n)/f(n);
for k = n-1:-1:1
  x(k) = (r(k)-g(k)*x(k+1))/f(k);
end


FIGURE 9.6 
An M-file to solve a tridiagonal system. 


9.4.1 MATLAB M-file: Tridiag


An M-file that solves a tridiagonal system of equations is listed in Fig. 9.6. Note that the 
algorithm does not include partial pivoting. Although pivoting is sometimes required, most 
tridiagonal systems routinely solved in engineering and science do not require pivoting. 

Recall that the computational effort for Gauss elimination was proportional to $n^3$.
Because of its sparseness, the effort involved in solving tridiagonal systems is proportional
to n. Consequently, the algorithm in Fig. 9.6 executes much, much faster than Gauss
elimination, particularly for large systems.


9.5 CASE STUDY MODEL OF A HEATED ROD


Background. Linear algebraic equations can arise when modeling distributed systems. 
For example, Fig. 9.7 shows a long, thin rod positioned between two walls that are held at 
constant temperatures. Heat flows through the rod as well as between the rod and the sur- 
rounding air. For the steady-state case, a differential equation based on heat conservation 
can be written for such a system as 

$$\frac{d^2T}{dx^2} + h'(T_a - T) = 0 \tag{9.24}$$


FIGURE 9.7 
A noninsulated uniform rod positioned between two walls of constant but different temperature. 
The finite-difference representation employs four interior nodes. 


where T = temperature (°C), x = distance along the rod (m), h′ = a heat transfer coefficient
between the rod and the surrounding air ($\text{m}^{-2}$), and $T_a$ = the air temperature (°C).

Given values for the parameters, forcing functions, and boundary conditions, calculus 
can be used to develop an analytical solution. For example, if h′ = 0.01, $T_a = 20$, T(0) = 40,
and T(10) = 200, the solution is


$$T = 73.4523e^{0.1x} - 53.4523e^{-0.1x} + 20 \tag{9.25}$$


Although it provided a solution here, calculus does not work for all such problems. In 
such instances, numerical methods provide a valuable alternative. In this case study, we 
will use finite differences to transform this differential equation into a tridiagonal system 
of linear algebraic equations which can be readily solved using the numerical methods 
described in this chapter. 


Solution. Equation (9.24) can be transformed into a set of linear algebraic equations by 
conceptualizing the rod as consisting of a series of nodes. For example, the rod in Fig. 9.7 
is divided into six equispaced nodes. Since the rod has a length of 10, the spacing between 
nodes is $\Delta x = 2$.

Calculus was necessary to solve Eq. (9.24) because it includes a second derivative. 
As we learned in Sec. 4.3.4, finite-difference approximations provide a means to transform 
derivatives into algebraic form. For example, the second derivative at each node can be 
approximated as 


$$\frac{d^2T}{dx^2} \approx \frac{T_{i-1} - 2T_i + T_{i+1}}{\Delta x^2}$$


where $T_i$ designates the temperature at node i. This approximation can be substituted into
Eq. (9.24) to give

$$\frac{T_{i-1} - 2T_i + T_{i+1}}{\Delta x^2} + h'(T_a - T_i) = 0$$

Collecting terms and substituting the parameters gives


$$-T_{i-1} + 2.04T_i - T_{i+1} = 0.8 \tag{9.26}$$


Thus, Eq. (9.24) has been transformed from a differential equation into an algebraic equa- 
tion. Equation (9.26) can now be applied to each of the interior nodes: 


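$$\begin{aligned} -T_0 + 2.04T_1 - T_2 &= 0.8 \\ -T_1 + 2.04T_2 - T_3 &= 0.8 \\ -T_2 + 2.04T_3 - T_4 &= 0.8 \\ -T_3 + 2.04T_4 - T_5 &= 0.8 \end{aligned} \tag{9.27}$$


The values of the fixed end temperatures, $T_0 = 40$ and $T_5 = 200$, can be substituted and
moved to the right-hand side. The results are four equations with four unknowns expressed
in matrix form as


$$\begin{bmatrix} 2.04 & -1 & 0 & 0 \\ -1 & 2.04 & -1 & 0 \\ 0 & -1 & 2.04 & -1 \\ 0 & 0 & -1 & 2.04 \end{bmatrix} \begin{Bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{Bmatrix} = \begin{Bmatrix} 40.8 \\ 0.8 \\ 0.8 \\ 200.8 \end{Bmatrix} \tag{9.28}$$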

So our original differential equation has been converted into an equivalent system of
linear algebraic equations. Consequently, we can use the techniques described in this chapter
to solve for the temperatures. For example, using MATLAB


>> A = [2.04 -1 0 0
-1 2.04 -1 0
0 -1 2.04 -1
0 0 -1 2.04];
>> b = [40.8 0.8 0.8 200.8]';
>> T = (A\b)'

T =
   65.9698   93.7785  124.5382  159.4795

A plot can also be developed comparing these results with the analytical solution obtained
with Eq. (9.25),


>> T = [40 T 200];
>> x = [0:2:10];
>> xanal = [0:10];
>> TT = @(x) 73.4523*exp(0.1*x)-53.4523* ...
exp(-0.1*x)+20;
>> Tanal = TT(xanal);
>> plot(x,T,'o',xanal,Tanal)


As in Fig. 9.8, the numerical results are quite close to those obtained with calculus.


FIGURE 9.8
A plot of temperature versus distance along a heated rod. Both analytical (line) and numerical 
(points) solutions are displayed. 


In addition to being a linear system, notice that Eq. (9.28) is also tridiagonal. We can
use an efficient solution scheme like the M-file in Fig. 9.6 to obtain the solution:


>> e = [0 -1 -1 -1];
>> f = [2.04 2.04 2.04 2.04];
>> g = [-1 -1 -1 0];
>> r = [40.8 0.8 0.8 200.8];
>> Tridiag(e,f,g,r)

ans =
   65.9698   93.7785  124.5382  159.4795


The system is tridiagonal because each node depends only on its adjacent nodes. 
Because we numbered the nodes sequentially, the resulting equations are tridiagonal. Such 
cases often occur when solving differential equations based on conservation laws. 
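
The case study can be generalized to any number of interior nodes by generating the tridiagonal coefficients from Eq. (9.26) written in symbolic form, $-T_{i-1} + (2 + h'\Delta x^2)T_i - T_{i+1} = h'\Delta x^2 T_a$. The script below is a sketch along those lines. The variable names (nint, hp, dx, and so on) are assumptions made here, and the forward-elimination and back-substitution loops are written inline, following the pattern of Fig. 9.6, so that the script is self-contained. With nint = 4 it should reproduce the four temperatures obtained above.

```matlab
% Heated rod: build and solve the tridiagonal finite-difference system (sketch)
L = 10;  hp = 0.01;  Ta = 20;           % rod length, h', air temperature
T0 = 40;  TL = 200;                     % fixed end temperatures
nint = 4;                               % number of interior nodes
dx = L/(nint + 1);

e = -ones(1,nint);                      % subdiagonal   (e(1) unused)
f = (2 + hp*dx^2)*ones(1,nint);         % main diagonal
g = -ones(1,nint);                      % superdiagonal (g(nint) unused)
r = hp*dx^2*Ta*ones(1,nint);            % right-hand side
r(1) = r(1) + T0;                       % move known end temperatures
r(nint) = r(nint) + TL;                 % to the right-hand side

for k = 2:nint                          % forward elimination
  factor = e(k)/f(k-1);
  f(k) = f(k) - factor*g(k-1);
  r(k) = r(k) - factor*r(k-1);
end
T = zeros(1,nint);                      % back substitution
T(nint) = r(nint)/f(nint);
for k = nint-1:-1:1
  T(k) = (r(k) - g(k)*T(k+1))/f(k);
end

x = dx*(1:nint);                        % interior node locations
Tanal = 73.4523*exp(0.1*x) - 53.4523*exp(-0.1*x) + 20;   % Eq. (9.25)
disp([x' T' Tanal'])
```

Increasing nint refines the grid, which should bring the numerical temperatures closer to the analytical values from Eq. (9.25).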
PROBLEMS


9.1 Determine the number of total flops as a function of the
number of equations n for the tridiagonal algorithm (Fig. 9.6).
9.2 Use the graphical method to solve


$$4x_1 - 8x_2 = -24$$
$$x_1 + 6x_2 = 34$$
Check your results by substituting them back into the equations.
9.3 Given the system of equations
$$-1.1x_1 + 10x_2 = 120$$
$$-2x_1 + 17.4x_2 = 174$$
(a) Solve graphically and check your results by substituting
them back into the equations.
(b) On the basis of the graphical solution, what do you
expect regarding the condition of the system?
(c) Compute the determinant.
9.4 Given the system of equations


$$-3x_2 + 7x_3 = 4$$
$$x_1 + 2x_2 - x_3 = 0$$
$$5x_1 - 2x_2 = 3$$


(a) Compute the determinant. 

(b) Use Cramer’s rule to solve for the x’s. 

(c) Use Gauss elimination with partial pivoting to solve for 
the x’s. As part of the computation, calculate the deter- 
minant in order to verify the value computed in (a). 

(d) Substitute your results back into the original equations 
to check your solution. 

9.5 Given the equations 


$$0.5x_1 - x_2 = -9.5$$
$$1.02x_1 - 2x_2 = -18.8$$


(a) Solve graphically. 

(b) Compute the determinant. 

(c) On the basis of (a) and (b), what would you expect 
regarding the system’s condition? 

(d) Solve by the elimination of unknowns. 

(e) Solve again, but with $a_{11}$ modified slightly to 0.52.
Interpret your results.

9.6 Given the equations 


$$10x_1 + 2x_2 - x_3 = 27$$
$$-3x_1 - 5x_2 + 2x_3 = -61.5$$
$$x_1 + x_2 + 6x_3 = -21.5$$
(a) Solve by naive Gauss elimination. Show all steps of the 
computation. 


(b) Substitute your results into the original equations to 
check your answers. 


9.7 Given the equations 
$$2x_1 - 6x_2 - x_3 = -38$$
$$-3x_1 - x_2 + 7x_3 = -34$$
$$-8x_1 + x_2 - 2x_3 = -20$$
(a) Solve by Gauss elimination with partial pivoting. As part 
of the computation, use the diagonal elements to calcu- 
late the determinant. Show all steps of the computation. 
(b) Substitute your results into the original equations to 
check your answers. 
9.8 Perform the same calculations as in Example 9.5, but for 
the tridiagonal system: 


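$$\begin{bmatrix} 0.8 & -0.4 & \\ -0.4 & 0.8 & -0.4 \\ & -0.4 & 0.8 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} = \begin{Bmatrix} 41 \\ 25 \\ 105 \end{Bmatrix}$$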


9.9 Figure P9.9 shows three reactors linked by pipes. As 
indicated, the rate of transfer of chemicals through each 
pipe is equal to a flow rate (Q, with units of cubic meters 
per second) multiplied by the concentration of the reactor 
from which the flow originates (c, with units of milligrams 
per cubic meter). If the system is at a steady state, the transfer 
into each reactor will balance the transfer out. Develop mass- 
balance equations for the reactors and solve the three simul- 
taneous linear algebraic equations for their concentrations. 

FIGURE P9.9

Three reactors linked by pipes. The rate of mass transfer
through each pipe is equal to the product of flow Q
and concentration c of the reactor from which the flow
originates.


9.10 A civil engineer involved in construction requires
4800, 5800, and 5700 m^3 of sand, fine gravel, and coarse
gravel, respectively, for a building project. There are three
pits from which these materials can be obtained. The
composition of these pits is


          Sand    Fine Gravel    Coarse Gravel
           %          %               %
Pit 1      55         30              15
Pit 2      25         45              30
Pit 3      25         20              55


How many cubic meters must be hauled from each pit in
order to meet the engineer’s needs?

9.11 An electrical engineer supervises the production of 
three types of electrical components. Three kinds of mate- 
rial—metal, plastic, and rubber—are required for produc- 
tion. The amounts needed to produce each component are 


               Metal            Plastic          Rubber
Component   (g/component)    (g/component)    (g/component)
    1            15              0.30             1.0
    2            17              0.40             1.2
    3            19              0.55             1.5


If totals of 3.89, 0.095, and 0.282 kg of metal, plastic, and 
rubber, respectively, are available each day, how many com- 
ponents can be produced per day? 

9.12 As described in Sec. 9.4, linear algebraic equations can 
arise in the solution of differential equations. For example, 
the following differential equation results from a steady-state 
mass balance for a chemical in a one-dimensional canal: 


$$0 = D\frac{d^2c}{dx^2} - U\frac{dc}{dx} - kc$$

where c = concentration, t = time, x = distance, D = dif- 
fusion coefficient, U = fluid velocity, and k = a first-order 
decay rate. Convert this differential equation to an equiva- 
lent system of simultaneous algebraic equations. Given 
D =2, U=1,k=0.2, c(0) = 80 and c(10) = 20, solve these 
equations from x = 0 to 10 and develop a plot of concentra- 
tion versus distance. 
9.13 A stage extraction process is depicted in Fig. P9.13. 
In such systems, a stream containing a weight fraction $y_{in}$
of a chemical enters from the left at a mass flow rate of $F_1$.
Simultaneously, a solvent carrying a weight fraction $x_{in}$ of
the same chemical enters from the right at a flow rate of $F_2$.
Thus, for stage i, a mass balance can be represented as
$$F_1y_{i-1} + F_2x_{i+1} = F_1y_i + F_2x_i \tag{P9.13a}$$
At each stage, an equilibrium is assumed to be established
between $y_i$ and $x_i$ as in
$$K = \frac{x_i}{y_i} \tag{P9.13b}$$
where K is called a distribution coefficient. Equation (P9.13b)
can be solved for $x_i$ and substituted into Eq. (P9.13a) to yield
$$y_{i-1} - \left(1 + \frac{F_2}{F_1}K\right)y_i + \left(\frac{F_2}{F_1}K\right)y_{i+1} = 0 \tag{P9.13c}$$
If $F_1 = 500$ kg/h, $y_{in} = 0.1$, $F_2 = 1000$ kg/h, $x_{in} = 0$, and
K = 4, determine the values of $y_{out}$ and $x_{out}$ if a five-stage
reactor is used. Note that Eq. (P9.13c) must be modified to
account for the inflow weight fractions when applied to the
first and last stages.

FIGURE P9.13
A stage extraction process.


9.14 A peristaltic pump delivers a unit flow ($Q_1$) of a highly
viscous fluid. The network is depicted in Fig. P9.14. Every
pipe section has the same length and diameter. The mass and
mechanical energy balance can be simplified to obtain the
flows in every pipe. Solve the following system of equations
to obtain the flow in every stream.


$$Q_3 + 2Q_4 - 2Q_2 = 0 \qquad Q_1 = Q_2 + Q_3$$
$$Q_5 + 2Q_6 - 2Q_4 = 0 \qquad Q_3 = Q_4 + Q_5$$
$$3Q_7 - 2Q_6 = 0 \qquad Q_5 = Q_6 + Q_7$$


FIGURE P9.14

9.15 A truss is loaded as shown in Fig. P9.15. Using the 
following set of equations, solve for the 10 unknowns, AB, 
BC, AD, BD, CD, DE, CE, $A_x$, $A_y$, and $E_y$. 

$$\begin{aligned}
A_x + AD &= 0 \\
A_y + AB &= 0 \\
74 + BC + (3/5)BD &= 0 \\
-AB - (4/5)BD &= 0 \\
-BC + (3/5)CE &= 0 \\
-24 - CD - (4/5)CE &= 0 \\
-AD + DE - (3/5)BD &= 0 \\
CD + (4/5)BD &= 0 \\
-DE - (3/5)CE &= 0 \\
E_y + (4/5)CE &= 0
\end{aligned}$$

FIGURE P9.15 
(Truss with nodes A through E. Labels shown in the figure: 24 kN, 74 kN, 4 m, 3 m, 3 m.) 


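9.16 A pentadiagonal system with a bandwidth of five can 
be expressed generally as 

$$\begin{bmatrix}
f_1 & g_1 & h_1 & & & \\
e_2 & f_2 & g_2 & h_2 & & \\
d_3 & e_3 & f_3 & g_3 & h_3 & \\
 & \ddots & \ddots & \ddots & \ddots & \ddots \\
 & & d_{n-1} & e_{n-1} & f_{n-1} & g_{n-1} \\
 & & & d_n & e_n & f_n
\end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_{n-1} \\ x_n \end{Bmatrix}
=
\begin{Bmatrix} r_1 \\ r_2 \\ r_3 \\ \vdots \\ r_{n-1} \\ r_n \end{Bmatrix}$$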


Develop an M-file to efficiently solve such systems without 
pivoting in a similar fashion to the algorithm used for tridi- 
agonal matrices in Sec. 9.4.1. Test it for the following case: 


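$$\begin{bmatrix}
8 & -2 & -1 & 0 & 0 \\
-2 & 9 & -4 & -1 & 0 \\
-1 & -3 & 7 & -1 & -2 \\
0 & -4 & -2 & 12 & -5 \\
0 & 0 & -7 & -3 & -15
\end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{Bmatrix}
=
\begin{Bmatrix} 5 \\ 2 \\ 1 \\ 1 \\ 2 \end{Bmatrix}$$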


9.17 Develop an M-file function based on Fig. 9.5 to 
implement Gauss elimination with partial pivoting. Modify 
the function so that it computes and returns the determinant 
(with the correct sign), and detects whether the system is 
singular based on a near-zero determinant. For the latter, 
define “near-zero” as being when the absolute value of the 
determinant is below a tolerance. When this occurs, design 
the function so that an error message is displayed and the 
function terminates. Here is the function's first line: 


function [x, D] = GaussPivotNew(A, b, tol) 

where D = the determinant and tol = the tolerance. Test your 
program for Prob. 9.5 with tol = 1x 107. 

9.18 As described in Sec. 9.5, linear algebraic equations can 
arise in the solution of differential equations. The following 
differential equation results from a steady-state mass bal- 
ance for a chemical in a one-dimensional canal, 


$$0 = D\frac{d^2c}{dx^2} - U\frac{dc}{dx} - kc$$

where x = distance along the canal (m), c = concentration, 
t = time, D = diffusion coefficient, U = fluid velocity, and 
k = a first-order decay rate. 

(a) Convert this differential equation to an equivalent sys- 
tem of simultaneous algebraic equations using centered 
difference approximations for the derivatives. 

(b) Develop a function to solve these equations from x = 0 
to L and return the resulting distances and concentra- 
tions. The first line of your function should be 


function [x,c]=YourLastName_reactor(D, U, k, c0, cL, L, dx) 


(c) Develop a script that invokes this function and then 
plots the results. 

(d) Test your script for the following parameters: L = 10 m, 
$\Delta x$ = 0.5 m, D = 2 m$^2$/d, U = 1 m/d, k = 0.2/d, c(0) = 
80 mg/L, and c(10) = 20 mg/L. 

9.19 The following differential equation results from a 

force balance for a beam with a uniform loading, 

$$0 = EI\frac{d^2y}{dx^2} - \frac{wLx}{2} + \frac{wx^2}{2}$$


where x = distance along the beam (m), y = deflection (m), 
L = length (m), E = modulus of elasticity (N/m$^2$), I = 
moment of inertia (m$^4$), and w = uniform load (N/m). 

(a) Convert this differential equation to an equivalent system 
of simultaneous algebraic equations using a centered- 
difference approximation for the second derivative. 


(b) Develop a function to solve these equations from x = 0 
to L and return the resulting distances and deflections. 
The first line of your function should be 


function [x, y]=YourLastName_beam(E, I, w, y0, yL, L, dx) 


(c) Develop a script that invokes this function and then 
plots the results. 

(d) Test your script for the following parameters: L = 3 m, 
$\Delta x$ = 0.2 m, E = 250 × 10$^9$ N/m$^2$, I = 3 × 10$^{-4}$ m$^4$, w = 
22,500 N/m, y(0) = 0, and y(3) = 0. 

9.20 Heat is conducted along a metal rod positioned be- 
tween two fixed temperature walls. Aside from conduction, 
heat is transferred between the rod and the surrounding air 
by convection. Based on a heat balance, the distribution of 
temperature along the rod is described by the following sec- 
ond-order differential equation 


$$\frac{d^2T}{dx^2} + h'(T_\infty - T) = 0$$

where T = temperature (K), h' = a bulk heat transfer 
coefficient reflecting the relative importance of convection to 
conduction (m$^{-2}$), x = distance along the rod (m), and $T_\infty$ = 
temperature of the surrounding fluid (K). 

(a) Convert this differential equation to an equivalent sys- 
tem of simultaneous algebraic equations using a centered 
difference approximation for the second derivative. 

(b) Develop a function to solve these equations from x = 0 
to L and return the resulting distances and temperatures. 
The first line of your function should be 


function [x, y]=YourLastName_rod(hp, Tinf, T0, TL, L, dx) 


(c) Develop a script that invokes this function and then 
plots the results. 

(d) Test your script for the following parameters: h' = 
0.0425 m$^{-2}$, L = 12 m, $T_\infty$ = 220 K, T(0) = 320 K, 
T(L) = 450 K, and $\Delta x$ = 0.5 m. 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to acquaint you with LU factorization.¹ 
Specific objectives and topics covered are 


Understanding that LU factorization involves decomposing the coefficient matrix 
into two triangular matrices that can then be used to efficiently evaluate different 
right-hand-side vectors. 

Knowing how to express Gauss elimination as an LU factorization. 


Given an LU factorization, knowing how to evaluate multiple right-hand-side 
vectors. 

Recognizing that Cholesky’s method provides an efficient way to decompose a 
symmetric matrix and that the resulting triangular matrix and its transpose can be 
used to evaluate right-hand-side vectors efficiently. 

Understanding in general terms what happens when MATLAB’s backslash 
operator is used to solve linear systems. 


As described in Chap. 9, Gauss elimination is designed to solve systems of linear 
algebraic equations: 


[A] {x} = {b} (10.1) 


Although it certainly represents a sound way to solve such systems, it becomes inefficient 
when solving equations with the same coefficients [A], but with different right-hand-side 
constants {b}. 


¹ In the parlance of numerical methods, the terms “factorization” and “decomposition” are synonymous. To be 
consistent with the MATLAB documentation, we have chosen to employ the terminology LU factorization for 
the subject of this chapter. Note that LU decomposition is very commonly used to describe the same approach. 


Recall that Gauss elimination involves two steps: forward elimination and back 
substitution (Fig. 9.3). As we learned in Sec. 9.2.2, the forward-elimination step comprises the 
bulk of the computational effort. This is particularly true for large systems of equations. 

LU factorization methods separate the time-consuming elimination of the matrix [A] 
from the manipulations of the right-hand side {b}. Thus, once [A] has been “factored” or 
“decomposed,” multiple right-hand-side vectors can be evaluated in an efficient manner. 

Interestingly, Gauss elimination itself can be expressed as an LU factorization. 
Before showing how this can be done, let us first provide a mathematical overview of the 
factorization strategy. 


10.1 OVERVIEW OF LU FACTORIZATION 


Just as was the case with Gauss elimination, LU factorization requires pivoting to avoid 
division by zero. However, to simplify the following description, we will omit pivoting. In 
addition, the following explanation is limited to a set of three simultaneous equations. The 
results can be directly extended to n-dimensional systems. 

Equation (10.1) can be rearranged to give 


$$[A]\{x\} - \{b\} = 0 \tag{10.2}$$

Suppose that Eq. (10.2) could be expressed as an upper triangular system. For example, 
for a 3 × 3 system: 

$$\begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} d_1 \\ d_2 \\ d_3 \end{Bmatrix} \tag{10.3}$$

Recognize that this is similar to the manipulation that occurs in the first step of Gauss 
elimination. That is, elimination is used to reduce the system to upper triangular form. 
Equation (10.3) can also be expressed in matrix notation and rearranged to give 

$$[U]\{x\} - \{d\} = 0 \tag{10.4}$$

Now assume that there is a lower triangular matrix with 1's on the diagonal, 

$$[L] = \begin{bmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{bmatrix} \tag{10.5}$$

that has the property that when Eq. (10.4) is premultiplied by it, Eq. (10.2) is the result. 
That is, 

$$[L]\{[U]\{x\} - \{d\}\} = [A]\{x\} - \{b\} \tag{10.6}$$

If this equation holds, it follows from the rules for matrix multiplication that 

[L][U] = [A] (10.7) 
and 


[L]{d} = {b} (10.8) 


FIGURE 10.1 
The steps in LU factorization: (a) factorization of [A] into [L] and [U]; (b) forward 
substitution, in which [L]{d} = {b} is solved for {d}; and (c) back substitution, in which 
[U]{x} = {d} is solved for {x}. 


A two-step strategy (see Fig. 10.1) for obtaining solutions can be based on Eqs. (10.3), 
(10.7), and (10.8): 


1. LU factorization step. [A] is factored or “decomposed” into lower [L] and upper [U] 
triangular matrices. 

2. Substitution step. [L] and [U] are used to determine a solution {x} for a right-hand 
side {b}. This step itself consists of two steps. First, Eq. (10.8) is used to generate an 
intermediate vector {d} by forward substitution. Then, the result is substituted into 
Eq. (10.3), which can be solved by back substitution for {x}. 


Now let us show how Gauss elimination can be implemented in this way. 


10.2 GAUSS ELIMINATION AS LU FACTORIZATION 


Although it might appear at face value to be unrelated to LU factorization, Gauss elimina- 
tion can be used to decompose [A] into [L] and [U]. This can be easily seen for [U], which 
is a direct product of the forward elimination. Recall that the forward-elimination step is 
intended to reduce the original coefficient matrix [A] to the form 


$$[U] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a'_{22} & a'_{23} \\ 0 & 0 & a''_{33} \end{bmatrix} \tag{10.9}$$

which is in the desired upper triangular format. 

Though it might not be as apparent, the matrix [L] is also produced during this step. 
This can be readily illustrated for a three-equation system, 

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} b_1 \\ b_2 \\ b_3 \end{Bmatrix}$$

The first step in Gauss elimination is to multiply row 1 by the factor [recall Eq. (9.9)] 

$$f_{21} = \frac{a_{21}}{a_{11}}$$

and subtract the result from the second row to eliminate $a_{21}$. Similarly, row 1 is multiplied by 

$$f_{31} = \frac{a_{31}}{a_{11}}$$

and the result subtracted from the third row to eliminate $a_{31}$. The final step is to multiply 
the modified second row by 

$$f_{32} = \frac{a'_{32}}{a'_{22}}$$

and subtract the result from the third row to eliminate $a'_{32}$. 

Now suppose that we merely perform all these manipulations on the matrix [A]. 
Clearly, if we do not want to change the equations, we also have to do the same to the 
right-hand side {b}. But there is absolutely no reason that we have to perform the manipulations 
simultaneously. Thus, we could save the f's and manipulate {b} later. 

Where do we store the factors $f_{21}$, $f_{31}$, and $f_{32}$? Recall that the whole idea behind the 
elimination was to create zeros in $a_{21}$, $a_{31}$, and $a_{32}$. Thus, we can store $f_{21}$ in $a_{21}$, $f_{31}$ in $a_{31}$, 
and $f_{32}$ in $a_{32}$. After elimination, the [A] matrix can therefore be written as 

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ f_{21} & a'_{22} & a'_{23} \\ f_{31} & f_{32} & a''_{33} \end{bmatrix} \tag{10.10}$$

This matrix, in fact, represents an efficient storage of the LU factorization of [A], 

$$[A] \rightarrow [L][U] \tag{10.11}$$

where 

$$[U] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a'_{22} & a'_{23} \\ 0 & 0 & a''_{33} \end{bmatrix} \tag{10.12}$$

and 

$$[L] = \begin{bmatrix} 1 & 0 & 0 \\ f_{21} & 1 & 0 \\ f_{31} & f_{32} & 1 \end{bmatrix} \tag{10.13}$$


The following example confirms that [A] = [L][U]. 


EXAMPLE 10.1 LU Factorization with Gauss Elimination 


Problem Statement. Derive an LU factorization based on the Gauss elimination 
performed previously in Example 9.3. 


Solution. In Example 9.3, we used Gauss elimination to solve a set of linear algebraic 
equations that had the following coefficient matrix: 

$$[A] = \begin{bmatrix} 3 & -0.1 & -0.2 \\ 0.1 & 7 & -0.3 \\ 0.3 & -0.2 & 10 \end{bmatrix}$$

After forward elimination, the following upper triangular matrix was obtained: 

$$[U] = \begin{bmatrix} 3 & -0.1 & -0.2 \\ 0 & 7.00333 & -0.293333 \\ 0 & 0 & 10.0120 \end{bmatrix}$$

The factors employed to obtain the upper triangular matrix can be assembled into a lower 
triangular matrix. The elements $a_{21}$ and $a_{31}$ were eliminated by using the factors 

$$f_{21} = \frac{0.1}{3} = 0.0333333 \qquad f_{31} = \frac{0.3}{3} = 0.1000000$$

and the element $a'_{32}$ was eliminated by using the factor 

$$f_{32} = \frac{-0.19}{7.00333} = -0.0271300$$

Thus, the lower triangular matrix is 

$$[L] = \begin{bmatrix} 1 & 0 & 0 \\ 0.0333333 & 1 & 0 \\ 0.100000 & -0.0271300 & 1 \end{bmatrix}$$

Consequently, the LU factorization is 

$$[A] = [L][U] = \begin{bmatrix} 1 & 0 & 0 \\ 0.0333333 & 1 & 0 \\ 0.100000 & -0.0271300 & 1 \end{bmatrix}
\begin{bmatrix} 3 & -0.1 & -0.2 \\ 0 & 7.00333 & -0.293333 \\ 0 & 0 & 10.0120 \end{bmatrix}$$

This result can be verified by performing the multiplication of [L][U] to give 

$$[L][U] = \begin{bmatrix} 3 & -0.1 & -0.2 \\ 0.0999999 & 7 & -0.3 \\ 0.3 & -0.2 & 9.99996 \end{bmatrix}$$

where the minor discrepancies are due to roundoff. 
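
The compact storage of Eq. (10.10) translates almost directly into code. The following script is a minimal sketch rather than a general-purpose routine: it assumes that no pivoting is needed (true for this diagonally dominant matrix), and the working array F is a name introduced here purely for illustration.

```matlab
% Naive Gauss elimination that stores the factors in place, as in Eq. (10.10).
% Assumption: no zero (or very small) pivots occur, so pivoting is skipped.
A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];
n = size(A,1);
F = A;                                   % working copy, overwritten in place
for k = 1:n-1                            % pivot row
    for i = k+1:n                        % rows below the pivot
        F(i,k) = F(i,k)/F(k,k);          % factor f_ik stored where the zero would go
        F(i,k+1:n) = F(i,k+1:n) - F(i,k)*F(k,k+1:n);
    end
end
L = eye(n) + tril(F,-1);                 % unit diagonal plus the stored factors
U = triu(F);                             % what remains on and above the diagonal
disp(L), disp(U), disp(L*U)
```

Displaying L and U should reproduce the matrices assembled by hand above, and L*U should return [A] to within roundoff.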


After the matrix is decomposed, a solution can be generated for a particular 
right-hand-side vector {b}. This is done in two steps. First, a forward-substitution step is executed by 
solving Eq. (10.8) for {d}. It is important to recognize that this merely amounts to 
performing the elimination manipulations on {b}. Thus, at the end of this step, the right-hand side 
will be in the same state that it would have been had we performed forward manipulation 
on [A] and {b} simultaneously. 

The forward-substitution step can be represented concisely as 

$$d_i = b_i - \sum_{j=1}^{i-1} l_{ij} d_j \qquad \text{for } i = 1, 2, \ldots, n$$

The second step then merely amounts to implementing back substitution to solve 
Eq. (10.3). Again, it is important to recognize that this is identical to the back-substitution 
phase of conventional Gauss elimination [compare with Eqs. (9.12) and (9.13)]: 

$$x_n = d_n / u_{nn}$$

$$x_i = \frac{d_i - \sum_{j=i+1}^{n} u_{ij} x_j}{u_{ii}} \qquad \text{for } i = n-1, n-2, \ldots, 1$$


EXAMPLE 10.2 The Substitution Steps 


Problem Statement. Complete the problem initiated in Example 10.1 by generating the 
final solution with forward and back substitution. 


Solution. As just stated, the intent of forward substitution is to impose the elimination 
manipulations that we had formerly applied to [A] on the right-hand-side vector {b}. Recall 
that the system being solved is 


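$$\begin{bmatrix} 3 & -0.1 & -0.2 \\ 0.1 & 7 & -0.3 \\ 0.3 & -0.2 & 10 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} 7.85 \\ -19.3 \\ 71.4 \end{Bmatrix}$$

and that the forward-elimination phase of conventional Gauss elimination resulted in 

$$\begin{bmatrix} 3 & -0.1 & -0.2 \\ 0 & 7.00333 & -0.293333 \\ 0 & 0 & 10.0120 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} 7.85 \\ -19.5617 \\ 70.0843 \end{Bmatrix}$$

The forward-substitution phase is implemented by applying Eq. (10.8): 

$$\begin{bmatrix} 1 & 0 & 0 \\ 0.0333333 & 1 & 0 \\ 0.100000 & -0.0271300 & 1 \end{bmatrix}
\begin{Bmatrix} d_1 \\ d_2 \\ d_3 \end{Bmatrix} =
\begin{Bmatrix} 7.85 \\ -19.3 \\ 71.4 \end{Bmatrix}$$

or multiplying out the left-hand side: 

$$\begin{aligned}
d_1 &= 7.85 \\
0.0333333d_1 + d_2 &= -19.3 \\
0.100000d_1 - 0.0271300d_2 + d_3 &= 71.4
\end{aligned}$$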


We can solve the first equation for $d_1$ = 7.85, which can be substituted into the second 
equation to solve for 

$$d_2 = -19.3 - 0.0333333(7.85) = -19.5617$$

Both $d_1$ and $d_2$ can be substituted into the third equation to give 

$$d_3 = 71.4 - 0.1(7.85) + 0.02713(-19.5617) = 70.0843$$

Thus, 

$$\{d\} = \begin{Bmatrix} 7.85 \\ -19.5617 \\ 70.0843 \end{Bmatrix}$$

This result can then be substituted into Eq. (10.3), [U]{x} = {d}: 

$$\begin{bmatrix} 3 & -0.1 & -0.2 \\ 0 & 7.00333 & -0.293333 \\ 0 & 0 & 10.0120 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} 7.85 \\ -19.5617 \\ 70.0843 \end{Bmatrix}$$

which can be solved by back substitution (see Example 9.3 for details) for the final solution: 

$$\{x\} = \begin{Bmatrix} 3 \\ -2.5 \\ 7.00003 \end{Bmatrix}$$
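
The two substitution formulas can be sketched as a pair of loops. The script below is illustrative only. It uses the rounded [L] and [U] from Example 10.1, so the computed {x} carries the same small roundoff as the hand calculation.

```matlab
% Forward and back substitution for [L]{d} = {b} and [U]{x} = {d}.
% L and U are the rounded factors from Example 10.1.
L = [1 0 0; 0.0333333 1 0; 0.100000 -0.0271300 1];
U = [3 -0.1 -0.2; 0 7.00333 -0.293333; 0 0 10.0120];
b = [7.85; -19.3; 71.4];
n = length(b);

d = zeros(n,1);
for i = 1:n                              % forward substitution
    d(i) = b(i) - L(i,1:i-1)*d(1:i-1);   % empty product is 0 when i = 1
end

x = zeros(n,1);
x(n) = d(n)/U(n,n);                      % last unknown first
for i = n-1:-1:1                         % back substitution
    x(i) = (d(i) - U(i,i+1:n)*x(i+1:n))/U(i,i);
end
disp(d), disp(x)
```

Because neither loop touches [A], the same [L] and [U] can be reused for any other right-hand-side vector simply by changing b.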


10.2.1 LU Factorization with Pivoting 


Just as for standard Gauss elimination, partial pivoting is necessary to obtain reliable solu- 
tions with LU factorization. One way to do this involves using a permutation matrix (recall 
Sec. 8.1.2). The approach consists of the following steps: 


1. Elimination. The LU factorization with pivoting of a matrix [A] can be represented in 
matrix form as 

[P][A] = [L][U] 

The upper triangular matrix, [U], is generated by elimination with partial pivoting, 
while storing the multiplier factors in [L] and employing the permutation matrix, [P], 
to keep track of the row switches. 

2. Forward substitution. The matrices [L] and [P] are used to perform the elimination 
step with pivoting on {b} in order to generate the intermediate right-hand-side vector, 
{d}. This step can be represented concisely as the solution of the following matrix 
formulation: 

[L]{d} = [P]{b} 

3. Back substitution. The final solution is generated in the same fashion as done 
previously for Gauss elimination. This step can also be represented concisely as the solution 
of the matrix formulation: 

[U]{x} = {d} 


The approach is illustrated in the following example. 


EXAMPLE 10.3 LU Factorization with Pivoting 


Problem Statement. Compute the LU factorization and find the solution for the same 
system analyzed in Example 9.4: 

$$\begin{bmatrix} 0.0003 & 3.0000 \\ 1.0000 & 1.0000 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} =
\begin{Bmatrix} 2.0001 \\ 1.0000 \end{Bmatrix}$$

Solution. Before elimination, we set up the initial permutation matrix: 

$$[P] = \begin{bmatrix} 1.0000 & 0.0000 \\ 0.0000 & 1.0000 \end{bmatrix}$$

We immediately see that pivoting is necessary, so prior to elimination we switch the rows: 

$$[A] = \begin{bmatrix} 1.0000 & 1.0000 \\ 0.0003 & 3.0000 \end{bmatrix}$$

At the same time, we keep track of the pivot by switching the rows of the permutation 
matrix: 

$$[P] = \begin{bmatrix} 0.0000 & 1.0000 \\ 1.0000 & 0.0000 \end{bmatrix}$$

We then eliminate $a_{21}$ by multiplying the first row by the factor $l_{21} = a_{21}/a_{11}$ = 0.0003/1 = 0.0003 
and subtracting the result from the second row of [A]. In so doing, we compute the new value 
$a'_{22}$ = 3 - 0.0003(1) = 2.9997. Thus, the elimination step is complete with the result: 

$$[U] = \begin{bmatrix} 1 & 1 \\ 0 & 2.9997 \end{bmatrix} \qquad
[L] = \begin{bmatrix} 1 & 0 \\ 0.0003 & 1 \end{bmatrix}$$

Before implementing forward substitution, the permutation matrix is used to reorder 
the right-hand-side vector to reflect the pivots as in 

$$[P]\{b\} = \begin{bmatrix} 0.0000 & 1.0000 \\ 1.0000 & 0.0000 \end{bmatrix}
\begin{Bmatrix} 2.0001 \\ 1 \end{Bmatrix} =
\begin{Bmatrix} 1 \\ 2.0001 \end{Bmatrix}$$

Then, forward substitution is applied as in 

$$\begin{bmatrix} 1 & 0 \\ 0.0003 & 1 \end{bmatrix}
\begin{Bmatrix} d_1 \\ d_2 \end{Bmatrix} =
\begin{Bmatrix} 1 \\ 2.0001 \end{Bmatrix}$$

which can be solved for $d_1$ = 1 and $d_2$ = 2.0001 - 0.0003(1) = 1.9998. At this point, the 
system is 

$$\begin{bmatrix} 1 & 1 \\ 0 & 2.9997 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} =
\begin{Bmatrix} 1 \\ 1.9998 \end{Bmatrix}$$

Applying back substitution gives the final result: 

$$x_2 = \frac{1.9998}{2.9997} = 0.66667$$

$$x_1 = \frac{1 - 1(0.66667)}{1} = 0.33333$$



The LU factorization algorithm requires the same total flops as for Gauss elimination. 
The only difference is that a little less effort is expended in the factorization phase since the 
operations are not applied to the right-hand side. Conversely, the substitution phase takes 
a little more effort. 
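
The pivoting procedure of this section can also be sketched in code. The script below is a minimal illustration for the system of Example 10.3 and is not meant to reproduce how any built-in routine works internally. The variable names (U as the working copy of [A], r as the pivot row) are choices made here.

```matlab
% LU factorization with partial pivoting, tracking row switches in P
% so that P*A = L*U. Illustrative sketch for the system of Example 10.3.
A = [0.0003 3; 1 1];
b = [2.0001; 1];
n = size(A,1);
U = A;  L = eye(n);  P = eye(n);
for k = 1:n-1
    [~, r] = max(abs(U(k:n,k)));         % largest-magnitude candidate pivot
    r = r + k - 1;                       % convert to a row index of U
    if r ~= k
        U([k r],:) = U([r k],:);         % switch rows of the working matrix
        P([k r],:) = P([r k],:);         % record the switch
        L([k r],1:k-1) = L([r k],1:k-1); % carry along factors already stored
    end
    for i = k+1:n
        L(i,k) = U(i,k)/U(k,k);          % multiplier
        U(i,k+1:n) = U(i,k+1:n) - L(i,k)*U(k,k+1:n);
        U(i,k) = 0;                      % eliminated entry
    end
end
d = L\(P*b);                             % forward substitution on reordered b
x = U\d;                                 % back substitution
disp(P), disp(L), disp(U), disp(x)
```

For this system the script switches the two rows once, and the displayed [P], [L], and [U] should match those obtained by hand in Example 10.3.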


10.2.2 MATLAB Function: lu 


MATLAB has a built-in function lu that generates the LU factorization. It has the general 
syntax: 


[L,U] = lu(X) 


where L and U are the lower triangular and upper triangular matrices, respectively, derived 
from the LU factorization of the matrix X. Note that this function uses partial pivoting to 
avoid division by zero. The following example shows how it can be employed to generate 
both the factorization and a solution for the same problem that was solved in Examples 10.1 
and 10.2. 

EXAMPLE 10.4 LU Factorization with MATLAB 


Problem Statement. Use MATLAB to compute the LU factorization and find the solution 
for the same linear system analyzed in Examples 10.1 and 10.2: 


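$$\begin{bmatrix} 3 & -0.1 & -0.2 \\ 0.1 & 7 & -0.3 \\ 0.3 & -0.2 & 10 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} 7.85 \\ -19.3 \\ 71.4 \end{Bmatrix}$$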


Solution. The coefficient matrix and the right-hand-side vector can be entered in standard 
fashion as 


>> A= [3 -.1 -.2;.1 7 -.3;.3 -.2 10]; 
>> b = [7.85; -19.3; 71.4]; 


Next, the LU factorization can be computed with 


>> [L,U] = lu(A) 

L = 
    1.0000         0         0 
    0.0333    1.0000         0 
    0.1000   -0.0271    1.0000 

U = 
    3.0000   -0.1000   -0.2000 
         0    7.0033   -0.2933 
         0         0   10.0120 


This is the same result that we obtained by hand in Example 10.1. We can test that it is 
correct by computing the original matrix as 


>> L*U 

ans = 
    3.0000   -0.1000   -0.2000 
    0.1000    7.0000   -0.3000 
    0.3000   -0.2000   10.0000 


To generate the solution, we first compute 
>> d = L\b 


d= 
7.8500 
-19.5617 
70.0843 


And then use this result to compute the solution 


>> x = U\d 

x= 
3.0000 
-2.5000 
7.0000 


These results conform to those obtained by hand in Example 10.2. 
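
Because the matrix in this example is diagonally dominant, no rows were switched, and the [L] returned by lu is truly lower triangular. When pivoting does occur, the two-output form returns a row-permuted version of [L]. Requesting a third output returns the permutation matrix separately, so that P*A = L*U. The following short sketch uses the system from Example 10.3. The variable names are arbitrary.

```matlab
% Two- and three-output forms of lu for a matrix that requires pivoting.
A2 = [0.0003 3; 1 1];
b2 = [2.0001; 1];

[L2, U2] = lu(A2);          % L2 is a row-permuted lower triangular matrix
disp(L2)                    % not lower triangular, yet L2*U2 still equals A2

[L3, U3, P3] = lu(A2);      % P3 records the row switch; P3*A2 = L3*U3
d2 = L3\(P3*b2);            % forward substitution on the reordered b
x2 = U3\d2;                 % back substitution
disp(P3), disp(L3), disp(U3), disp(x2)
```

The three-output form is the one that corresponds to the formulation [P][A] = [L][U] of Sec. 10.2.1.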


10.3 


CHOLESKY FACTORIZATION 


Recall from Chap. 8 that a symmetric matrix is one where $a_{ij} = a_{ji}$ for all i and j. In other 
words, [A] = [A]$^T$. Such systems occur commonly in both mathematical and engineering/ 
science problem contexts. 

Special solution techniques are available for such systems. They offer computational 
advantages because only half the storage is needed and only half the computation time is 
required for their solution. 

One of the most popular approaches involves Cholesky factorization (also called 
Cholesky decomposition). This algorithm is based on the fact that a symmetric matrix 
(provided it is also positive definite, so that the square roots below are real) can be 
decomposed, as in 

$$[A] = [U]^T[U] \tag{10.14}$$

That is, the resulting triangular factors are the transpose of each other. 
The terms of Eq. (10.14) can be multiplied out and set equal to each other. The 
factorization can be generated efficiently by recurrence relations. For the ith row: 

$$u_{ii} = \sqrt{a_{ii} - \sum_{k=1}^{i-1} u_{ki}^2} \tag{10.15}$$

$$u_{ij} = \frac{a_{ij} - \sum_{k=1}^{i-1} u_{ki} u_{kj}}{u_{ii}} \qquad \text{for } j = i+1, \ldots, n \tag{10.16}$$


EXAMPLE 10.5 Cholesky Factorization 

Problem Statement. Compute the Cholesky factorization for the symmetric matrix 

$$[A] = \begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix}$$

Solution. For the first row (i = 1), Eq. (10.15) is employed to compute 

$$u_{11} = \sqrt{a_{11}} = \sqrt{6} = 2.44949$$

Then, Eq. (10.16) can be used to determine 

$$u_{12} = \frac{a_{12}}{u_{11}} = \frac{15}{2.44949} = 6.123724$$

$$u_{13} = \frac{a_{13}}{u_{11}} = \frac{55}{2.44949} = 22.45366$$

For the second row (i = 2): 

$$u_{22} = \sqrt{a_{22} - u_{12}^2} = \sqrt{55 - (6.123724)^2} = 4.1833$$

$$u_{23} = \frac{a_{23} - u_{12}u_{13}}{u_{22}} = \frac{225 - 6.123724(22.45366)}{4.1833} = 20.9165$$

For the third row (i = 3): 

$$u_{33} = \sqrt{a_{33} - u_{13}^2 - u_{23}^2} = \sqrt{979 - (22.45366)^2 - (20.9165)^2} = 6.110101$$

Thus, the Cholesky factorization yields 

$$[U] = \begin{bmatrix} 2.44949 & 6.123724 & 22.45366 \\ & 4.1833 & 20.9165 \\ & & 6.110101 \end{bmatrix}$$


The validity of this factorization can be verified by substituting it and its transpose 
into Eq. (10.14) to see if their product yields the original matrix [A]. This is left for an 
exercise. 
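
Equations (10.15) and (10.16) map directly onto two nested loops. The following script is a sketch under the assumption that [A] is symmetric and positive definite. The error check on the radicand is an addition made here and is not part of the recurrence itself.

```matlab
% Cholesky factorization by the recurrence relations, Eqs. (10.15) and (10.16).
% Assumption: A is symmetric and positive definite.
A = [6 15 55; 15 55 225; 55 225 979];
n = size(A,1);
U = zeros(n);
for i = 1:n
    s = A(i,i) - sum(U(1:i-1,i).^2);     % radicand of Eq. (10.15)
    if s <= 0
        error('Matrix is not positive definite.');
    end
    U(i,i) = sqrt(s);
    for j = i+1:n                        % remainder of row i, Eq. (10.16)
        U(i,j) = (A(i,j) - sum(U(1:i-1,i).*U(1:i-1,j)))/U(i,i);
    end
end
disp(U), disp(U'*U)
```

Only the upper triangle of [A] is ever referenced, which is the source of the storage savings mentioned at the start of this section.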


After obtaining the factorization, it can be used to determine a solution for a right- 
hand-side vector {b} in a manner similar to LU factorization. First, an intermediate vector 
{d} is created by solving 


$$[U]^T\{d\} = \{b\} \tag{10.17}$$

Then, the final solution can be obtained by solving 

$$[U]\{x\} = \{d\} \tag{10.18}$$


10.3.1 MATLAB Function: chol 


MATLAB has a built-in function chol that generates the Cholesky factorization. It has the 
general syntax, 


U = chol(X) 

where U is an upper triangular matrix so that U'*U = X. The following example shows how it 
can be employed to generate both the factorization and a solution for the same matrix that 
we looked at in the previous example. 


EXAMPLE 10.6 Cholesky Factorization with MATLAB 


Problem Statement. Use MATLAB to compute the Cholesky factorization for the same 
matrix we analyzed in Example 10.5. 


$$[A] = \begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix}$$


Also obtain a solution for a right-hand-side vector that is the sum of the rows of [A]. Note 
that for this case, the answer will be a vector of ones. 


Solution. The matrix is entered in standard fashion as 
>> A = [6 15 55; 15 55 225; 55 225 979]; 


A right-hand-side vector that is the sum of the rows of [A] can be generated as 
>> b = [sum(A(1,:)); sum(A(2,:)); sum(A(3,:))] 


b= 
76 
295 
1259 


Next, the Cholesky factorization can be computed with 
>> U = chol(A) 


U = 
2.4495 6.1237 22.4537 
0 4.1833 20.9165 
0 0 6.1101 


We can test that this is correct by computing the original matrix as 


>> U'*U 

ans = 
    6.0000   15.0000   55.0000 
   15.0000   55.0000  225.0000 
   55.0000  225.0000  979.0000 

To generate the solution, we first compute 

>> d = U'\b 

d = 
   31.0269 
   25.0998 
    6.1101 

And then use this result to compute the solution 

>> x = U\d 

x = 
    1.0000 
    1.0000 
    1.0000 


10.4 MATLAB LEFT DIVISION 


We previously introduced left division without any explanation of how it works. Now 
that we have some background on matrix solution techniques, we can provide a simplified 
description of its operation. 

When we implement left division with the backslash operator, MATLAB invokes a 
highly sophisticated algorithm to obtain a solution. In essence, MATLAB examines the 
structure of the coefficient matrix and then implements an optimal method to obtain the 
solution. Although the details of the algorithm are beyond our scope, a simplified overview 
can be outlined. 

First, MATLAB checks to see whether [A] is in a format where a solution can be 
obtained without full Gauss elimination. These include systems that are (a) sparse and 
banded, (b) triangular (or easily transformed into triangular form), or (c) symmetric. If 
any of these cases are detected, the solution is obtained with the efficient techniques that 
are available for such systems. Some of the techniques include banded solvers, back and 
forward substitution, and Cholesky factorization. 

If none of these simplified solutions are possible and the matrix is square,² a general 
triangular factorization is computed by Gauss elimination with partial pivoting and the 
solution obtained with substitution. 
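
The following script is a greatly simplified caricature of that decision process, offered only to make the idea concrete. It is a sketch built on stated assumptions and does not reproduce MATLAB's actual algorithm: it tries Cholesky factorization for a symmetric matrix (using the optional second output of chol, which is zero when the factorization succeeds) and otherwise falls back on LU factorization with partial pivoting.

```matlab
% A toy version of "examine the matrix, then pick a method."
% This is NOT how the backslash operator is implemented.
A = [6 15 55; 15 55 225; 55 225 979];
b = sum(A,2);                    % row sums, so the exact solution is all ones

p = 1;                           % assume Cholesky is not applicable
if isequal(A, A')                % symmetric?
    [U, p] = chol(A);            % p == 0 means the factorization succeeded
end

if p == 0
    x = U\(U'\b);                % Eqs. (10.17) and (10.18)
else
    [L, U, P] = lu(A);           % general case: LU with partial pivoting
    x = U\(L\(P*b));
end
disp(x)
disp(norm(x - A\b))              % compare with left division
```

The final line compares the result with left division itself. The two should agree to within roundoff.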


² It should be noted that in the event that [A] is not square, a least-squares solution is obtained with an approach 
called QR factorization. 


PROBLEMS 


10.1 Determine the total flops as a function of the number 
of equations n for the (a) factorization, (b) forward substitu- 
tion, and (c) back-substitution phases of the LU factorization 
version of Gauss elimination. 

10.2 Use the rules of matrix multiplication to prove that 
Eqs. (10.7) and (10.8) follow from Eq. (10.6). 

10.3 Use naive Gauss elimination to factor the following 
system according to the description in Sec. 10.2: 


$$\begin{aligned}
10x_1 + 2x_2 - x_3 &= 27 \\
-3x_1 - 6x_2 + 2x_3 &= -61.5 \\
x_1 + x_2 + 5x_3 &= -21.5
\end{aligned}$$


Then, multiply the resulting [L] and [U] matrices to deter- 
mine that [A] is produced. 

10.4 (a) Use LU factorization to solve the system of equa- 
tions in Prob. 10.3. Show all the steps in the computation. 
(b) Also solve the system for an alternative right-hand-side 
vector 


$$\{b\}^T = \lfloor 12 \quad 18 \quad -6 \rfloor$$


10.5 Solve the following system of equations using LU 
factorization with partial pivoting: 


$$\begin{aligned}
2x_1 - 6x_2 - x_3 &= -38 \\
-3x_1 - x_2 + 7x_3 &= -34 \\
-8x_1 + x_2 - 2x_3 &= -40
\end{aligned}$$


10.6 Develop your own M-file to determine the LU factor- 
ization of a square matrix without partial pivoting. That is, 
develop a function that is passed the square matrix and re- 
turns the triangular matrices [L] and [U]. Test your function 
by using it to solve the system in Prob. 10.3. Confirm that 
your function is working properly by verifying that [L][U] = 
[A] and by using the built-in function lu. 

10.7 Confirm the validity of the Cholesky factorization of 
Example 10.5 by substituting the results into Eq. (10.14) to 
verify that the product of [U]$^T$ and [U] yields [A]. 

10.8 (a) Perform a Cholesky factorization of the following 
symmetric system by hand: 


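$$\begin{bmatrix} 8 & 20 & 15 \\ 20 & 80 & 50 \\ 15 & 50 & 60 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} 50 \\ 250 \\ 100 \end{Bmatrix}$$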


(b) Verify your hand calculation with the built-in chol func- 
tion. (c) Employ the results of the factorization [U] to deter- 
mine the solution for the right-hand-side vector. 

10.9 Develop your own M-file to determine the Cholesky 
factorization of a symmetric matrix without pivoting. That 


is, develop a function that is passed the symmetric matrix 
and returns the matrix [U]. Test your function by using 
it to solve the system in Prob. 10.8 and use the built-in 
function chol to confirm that your function is working 
properly. 

10.10 Solve the following set of equations with LU factor- 
ization with pivoting: 


$$\begin{aligned}
3x_1 - 2x_2 + x_3 &= -10 \\
2x_1 + 6x_2 - 4x_3 &= 44 \\
-x_1 - 2x_2 + 5x_3 &= -26
\end{aligned}$$


10.11 (a) Determine the LU factorization without pivoting 
by hand for the following matrix and check your results by 
validating that [L][U] = [A]. 


$$\begin{bmatrix} 8 & 2 & 1 \\ 3 & 7 & 2 \\ 2 & 3 & 9 \end{bmatrix}$$

(b) Employ the result of (a) to compute the determinant. 
(c) Repeat (a) and (b) using MATLAB. 
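10.12 Use the following LU factorization to (a) compute 
the determinant and (b) solve [A]{x} = {b} with 
$\{b\}^T = \lfloor -10 \quad 44 \quad -26 \rfloor$. 

$$[A] = [L][U] = \begin{bmatrix} 1 & & \\ 0.6667 & 1 & \\ -0.3333 & -0.3636 & 1 \end{bmatrix}
\begin{bmatrix} 3 & -2 & 1 \\ & 7.3333 & -4.6667 \\ & & 3.6364 \end{bmatrix}$$
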
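10.13 Use Cholesky factorization to determine [U] so that 

$$[A] = [U]^T[U] = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$$

10.14 Compute the Cholesky factorization of 

$$[A] = \begin{bmatrix} 9 & 0 & 0 \\ 0 & 25 & 0 \\ 0 & 0 & 4 \end{bmatrix}$$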


Do your results make sense in terms of Eqs. (10.15) and 
(10.16)? 


Matrix Inverse and Condition 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to show how to compute the matrix inverse 
and to illustrate how it can be used to analyze complex linear systems that occur in 
engineering and science. In addition, a method to assess a matrix solution’s sensitivity 
to roundoff error is described. Specific objectives and topics covered are 


Knowing how to determine the matrix inverse in an efficient manner based on LU 
factorization. 

Understanding how the matrix inverse can be used to assess stimulus-response 
characteristics of engineering systems. 

Understanding the meaning of matrix and vector norms and how they are computed. 
Knowing how to use norms to compute the matrix condition number. 
Understanding how the magnitude of the condition number can be used to esti- 
mate the precision of solutions of linear algebraic equations. 


11.1 THE MATRIX INVERSE 


In our discussion of matrix operations (Sec. 8.1.2), we introduced the notion that if a matrix 
[A] is square, there is another matrix [A]$^{-1}$, called the inverse of [A], for which 

$$[A][A]^{-1} = [A]^{-1}[A] = [I] \tag{11.1}$$


Now we will focus on how the inverse can be computed numerically. Then we will explore 
how it can be used for engineering analysis. 


11.1.1 Calculating the Inverse 


The inverse can be computed in a column-by-column fashion by generating solutions with 
unit vectors as the right-hand-side constants. For example, if the right-hand-side constant 
has a 1 in the first position and zeros elsewhere, 

$$\{b\} = \begin{Bmatrix} 1 \\ 0 \\ 0 \end{Bmatrix} \tag{11.2}$$

the resulting solution will be the first column of the matrix inverse. Similarly, if a unit 
vector with a 1 at the second row is used 

$$\{b\} = \begin{Bmatrix} 0 \\ 1 \\ 0 \end{Bmatrix} \tag{11.3}$$

the result will be the second column of the matrix inverse. 

The best way to implement such a calculation is with LU factorization. Recall that one 
of the great strengths of LU factorization is that it provides a very efficient means to evalu- 
ate multiple right-hand-side vectors. Thus, it is ideal for evaluating the multiple unit vectors 
needed to compute the inverse. 


Matrix Inversion 


Problem Statement. Employ LU factorization to determine the matrix inverse for the 
system from Example 10.1: 


3 -0.1 =0.2 
[A]= |01 7 -03 
0.3 —0.2 10 
Recall that the factorization resulted in the following lower and upper triangular matrices: 


3 -0.1 —0.2 1 0 0 
[U]= |0 7.00333 —0.293333 [L] = | 0.0333333 1 0 
0 0 10.0120 0.100000 —0.0271300 1 


Solution. The first column of the matrix inverse can be determined by performing the
forward-substitution solution procedure with a unit vector (with 1 in the first row) as the
right-hand-side vector. Thus, the lower triangular system can be set up as [recall Eq. (10.8)]

$$\begin{bmatrix} 1 & 0 & 0 \\ 0.0333333 & 1 & 0 \\ 0.100000 & -0.0271300 & 1 \end{bmatrix} \begin{Bmatrix} d_1 \\ d_2 \\ d_3 \end{Bmatrix} = \begin{Bmatrix} 1 \\ 0 \\ 0 \end{Bmatrix}$$

and solved with forward substitution for $\{d\}^T = \lfloor 1 \;\; -0.03333 \;\; -0.1009 \rfloor$. This vector can
then be used as the right-hand side of the upper triangular system [recall Eq. (10.3)]

$$\begin{bmatrix} 3 & -0.1 & -0.2 \\ 0 & 7.00333 & -0.293333 \\ 0 & 0 & 10.0120 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} = \begin{Bmatrix} 1 \\ -0.03333 \\ -0.1009 \end{Bmatrix}$$

which can be solved by back substitution for $\{x\}^T = \lfloor 0.33249 \;\; -0.00518 \;\; -0.01008 \rfloor$, which
is the first column of the matrix inverse:

$$[A]^{-1} = \begin{bmatrix} 0.33249 & 0 & 0 \\ -0.00518 & 0 & 0 \\ -0.01008 & 0 & 0 \end{bmatrix}$$

To determine the second column, Eq. (10.8) is formulated as


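$$\begin{bmatrix} 1 & 0 & 0 \\ 0.0333333 & 1 & 0 \\ 0.100000 & -0.0271300 & 1 \end{bmatrix} \begin{Bmatrix} d_1 \\ d_2 \\ d_3 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 1 \\ 0 \end{Bmatrix}$$

This can be solved for $\{d\}$, and the results are used with Eq. (10.3) to determine
$\{x\}^T = \lfloor 0.004944 \;\; 0.142903 \;\; 0.00271 \rfloor$, which is the second column of the matrix inverse:

$$[A]^{-1} = \begin{bmatrix} 0.33249 & 0.004944 & 0 \\ -0.00518 & 0.142903 & 0 \\ -0.01008 & 0.002710 & 0 \end{bmatrix}$$

Finally, the same procedures can be implemented with $\{b\}^T = \lfloor 0 \;\; 0 \;\; 1 \rfloor$ to solve for
$\{x\}^T = \lfloor 0.006798 \;\; 0.004183 \;\; 0.09988 \rfloor$, which is the final column of the matrix inverse:

$$[A]^{-1} = \begin{bmatrix} 0.33249 & 0.004944 & 0.006798 \\ -0.00518 & 0.142903 & 0.004183 \\ -0.01008 & 0.002710 & 0.099880 \end{bmatrix}$$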


The validity of this result can be checked by verifying that $[A][A]^{-1} = [I]$.
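The column-by-column procedure of this example can be sketched in MATLAB. The following is a minimal illustration rather than a definitive implementation: it assumes MATLAB's built-in lu function (which applies partial pivoting and returns a permutation matrix P) and uses the backslash operator for the two triangular solves. The variable names are chosen here for illustration only.

```matlab
% Coefficient matrix from Example 10.1
A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];
n = size(A, 1);

% Factor once so that P*A = L*U (P is the identity if no rows are swapped)
[L, U, P] = lu(A);

I = eye(n);
AI = zeros(n);
for j = 1:n
    d = L \ (P * I(:, j));   % forward substitution with the jth unit vector
    AI(:, j) = U \ d;        % back substitution gives column j of the inverse
end

% Check the result: [A][A]^-1 should be close to the identity matrix
residual = norm(A * AI - I, inf);
```

Because the factorization is performed only once, each additional column costs just one forward and one back substitution.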


11.1.2 Stimulus-Response Computations 


As discussed in PT 3.1, many of the linear systems of equations arising in engineering and 
science are derived from conservation laws. The mathematical expression of these laws 
is some form of balance equation to ensure that a particular property—mass, force, heat, 
momentum, electrostatic potential—is conserved. For a force balance on a structure, the 
properties might be horizontal or vertical components of the forces acting on each node 
of the structure. For a mass balance, the properties might be the mass in each reactor of a 
chemical process. Other fields of engineering and science would yield similar examples. 

A single balance equation can be written for each part of the system, resulting in a set 
of equations defining the behavior of the property for the entire system. These equations 
are interrelated, or coupled, in that each equation may include one or more of the variables 
from the other equations. For many cases, these systems are linear and, therefore, of the 
exact form dealt with in this chapter: 


[A]{x} = {b} (11.4) 


Now, for balance equations, the terms of Eq. (11.4) have a definite physical interpre- 
tation. For example, the elements of {x} are the levels of the property being balanced for 
each part of the system. In a force balance of a structure, they represent the horizontal and 
vertical forces in each member. For the mass balance, they are the mass of chemical in each 
reactor. In either case, they represent the system’s state or response, which we are trying 
to determine. 

The right-hand-side vector {b} contains those elements of the balance that are
independent of behavior of the system; that is, they are constants. In many problems, they
represent the forcing functions or external stimuli that drive the system.

Finally, the matrix of coefficients [A] usually contains the parameters that express
how the parts of the system interact or are coupled. Consequently, Eq. (11.4) might be
reexpressed as


[Interactions]{response} = {stimuli} 


As we know from previous chapters, there are a variety of ways to solve Eq. (11.4). 
However, using the matrix inverse yields a particularly interesting result. The formal solu- 
tion can be expressed as 


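$$\{x\} = [A]^{-1}\{b\}$$

or (recalling our definition of matrix multiplication from Sec. 8.1.2)

$$\begin{aligned}
x_1 &= a_{11}^{-1} b_1 + a_{12}^{-1} b_2 + a_{13}^{-1} b_3 \\
x_2 &= a_{21}^{-1} b_1 + a_{22}^{-1} b_2 + a_{23}^{-1} b_3 \\
x_3 &= a_{31}^{-1} b_1 + a_{32}^{-1} b_2 + a_{33}^{-1} b_3
\end{aligned}$$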

Thus, we find that the inverted matrix itself, aside from providing a solution, has ex- 
tremely useful properties. That is, each of its elements represents the response of a single 
part of the system to a unit stimulus of any other part of the system. 

Notice that these formulations are linear and, therefore, superposition and proportion- 
ality hold. Superposition means that if a system is subject to several different stimuli (the 
b’s), the responses can be computed individually and the results summed to obtain a total 
response. Proportionality means that multiplying the stimuli by a quantity results in the 
response to those stimuli being multiplied by the same quantity. Thus, the coefficient $a_{11}^{-1}$ is
a proportionality constant that gives the value of $x_1$ due to a unit level of $b_1$. This result is
independent of the effects of $b_2$ and $b_3$ on $x_1$, which are reflected in the coefficients $a_{12}^{-1}$ and
$a_{13}^{-1}$, respectively. Therefore, we can draw the general conclusion that the element $a_{ij}^{-1}$ of the
inverted matrix represents the value of $x_i$ due to a unit quantity of $b_j$.

Using the example of the structure, element $a_{ij}^{-1}$ of the matrix inverse would represent
the force in member i due to a unit external force at node j. Even for small systems, such 
behavior of individual stimulus-response interactions would not be intuitively obvious. As 
such, the matrix inverse provides a powerful technique for understanding the interrelation- 
ships of component parts of complicated systems. 
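The stimulus-response interpretation, superposition, and proportionality can be checked numerically. The sketch below is only an illustration: it borrows the coefficient matrix of Example 10.1, and the stimulus vectors b1 and b2 are hypothetical values chosen for the demonstration.

```matlab
% Illustrative system (coefficient matrix borrowed from Example 10.1)
A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];
AI = inv(A);

% Column j of the inverse is the response to a unit stimulus in position j
e2 = [0; 1; 0];
unitResponse = A \ e2;
colDiff = norm(unitResponse - AI(:, 2), inf);

% Superposition: the response to b1 + b2 equals the sum of the responses
b1 = [5; 0; 0];          % hypothetical stimulus on part 1
b2 = [0; 0; -2];         % hypothetical stimulus on part 3
superDiff = norm(A \ (b1 + b2) - (A \ b1 + A \ b2), inf);

% Proportionality: tripling the stimulus triples the response
propDiff = norm(A \ (3 * b1) - 3 * (A \ b1), inf);
```

All three differences should be at the level of roundoff error, which is what linearity implies.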


EXAMPLE 11.2 Analyzing the Bungee Jumper Problem


Problem Statement. At the beginning of Chap. 8, we set up a problem involving three 
individuals suspended vertically connected by bungee cords. We derived a system of linear 
algebraic equations based on force balances for each jumper, 


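$$\begin{bmatrix} 150 & -100 & 0 \\ -100 & 150 & -50 \\ 0 & -50 & 50 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} = \begin{Bmatrix} 588.6 \\ 686.7 \\ 784.8 \end{Bmatrix}$$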


In Example 8.2, we used MATLAB to solve this system for the vertical positions of the
jumpers (the x’s). In the present example, use MATLAB to compute the matrix inverse and
interpret what it means.

Solution. Start up MATLAB and enter the coefficient matrix:

>> K = [150 -100 0;-100 150 -50;0 -50 50];

The inverse can then be computed as

>> KI = inv(K)

KI =
    0.0200    0.0200    0.0200
    0.0200    0.0300    0.0300
    0.0200    0.0300    0.0500


Each element of the inverse, $k_{ij}^{-1}$, represents the vertical change
in position (in meters) of jumper i due to a unit change in force (in newtons) applied to
jumper j.

First, observe that the numbers in the first column (j = 1) indicate that the position 
of all three jumpers would increase by 0.02 m if the force on the first jumper was increased 
by 1 N. This makes sense, because the additional force would only elongate the first cord 
by that amount. 

In contrast, the numbers in the second column ( j = 2) indicate that applying a force 
of 1 N to the second jumper would move the first jumper down by 0.02 m, but the second 
and third by 0.03 m. The 0.02-m elongation of the first jumper makes sense because the 
first cord is subject to an extra 1 N regardless of whether the force is applied to the first 
or second jumper. However, for the second jumper the elongation is now 0.03 m because 
along with the first cord, the second cord also elongates due to the additional force. And of 
course, the third jumper shows the identical translation as the second jumper as there is no 
additional force on the third cord that connects them. 

As expected, the third column ( j = 3) indicates that applying a force of 1 N to the 
third jumper results in the first and second jumpers moving the same distances as occurred 
when the force was applied to the second jumper. However, now because of the additional 
elongation of the third cord, the third jumper is moved farther downward. 

Superposition and proportionality can be demonstrated by using the inverse to deter- 
mine how much farther the third jumper would move downward if additional forces of 10, 
50, and 20 N were applied to the first, second, and third jumpers, respectively. This can be 
done simply by using the appropriate elements of the third row of the inverse to compute, 


$$\Delta x_3 = k_{31}^{-1}\,\Delta F_1 + k_{32}^{-1}\,\Delta F_2 + k_{33}^{-1}\,\Delta F_3 = 0.02(10) + 0.03(50) + 0.05(20) = 2.7 \text{ m}$$
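The same superposition calculation can be carried out directly in MATLAB. This is a brief sketch using the matrix K from this example; the names dF, dx3, and dx are introduced here for illustration.

```matlab
K = [150 -100 0; -100 150 -50; 0 -50 50];
KI = inv(K);

dF = [10; 50; 20];       % additional forces (N) on jumpers 1, 2, and 3
dx3 = KI(3, :) * dF;     % change in position of the third jumper (m)
dx = KI * dF;            % changes in position of all three jumpers (m)
```

Multiplying the full inverse by the vector of force changes gives the displacement of every jumper at once, with the third element matching the hand calculation.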


11.2 ERROR ANALYSIS AND SYSTEM CONDITION


Aside from its engineering and scientific applications, the inverse also provides a means to 
discern whether systems are ill-conditioned. Three direct methods can be devised for this 
purpose: 


1. Scale the matrix of coefficients [A] so that the largest element in each row is 1. Invert
the scaled matrix and if there are elements of $[A]^{-1}$ that are several orders of magnitude
greater than one, it is likely that the system is ill-conditioned.

2. Multiply the inverse by the original coefficient matrix and assess whether the result is
close to the identity matrix. If not, it indicates ill-conditioning.

3. Invert the inverted matrix and assess whether the result is sufficiently close to the 
original coefficient matrix. If not, it again indicates that the system is ill-conditioned. 
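The three checks can be sketched in MATLAB as follows. This is an illustration under stated assumptions: the 3 × 3 Hilbert matrix generated by the built-in hilb function is used as a test case, the variable names are hypothetical, and what counts as "close" is left to the analyst's judgment.

```matlab
H = hilb(3);                                  % 3 x 3 Hilbert matrix (test case)
A = diag(1 ./ max(abs(H), [], 2)) * H;        % scale so each row's largest element is 1

AI = inv(A);

% Check 1: size of the largest element of the inverse of the scaled matrix
largestInverseElement = max(abs(AI(:)));

% Check 2: how far [A][A]^-1 is from the identity matrix
identityError = norm(A * AI - eye(3), inf);

% Check 3: how far the inverse of the inverse is from the original matrix
roundTripError = norm(inv(AI) - A, inf);
```

For this small matrix the second and third checks return only roundoff-level discrepancies, whereas the first shows inverse elements roughly two orders of magnitude greater than one.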


Although these methods can indicate ill-conditioning, it would be preferable to obtain 
a single number that could serve as an indicator of the problem. Attempts to formulate such 
a matrix condition number are based on the mathematical concept of the norm. 


11.2.1 Vector and Matrix Norms 


A norm is a real-valued function that provides a measure of the size or “length” of multi- 
component mathematical entities such as vectors and matrices. 

A simple example is a vector in three-dimensional Euclidean space (Fig. 11.1) that 
can be represented as 


$$\lfloor F \rfloor = \lfloor a \;\; b \;\; c \rfloor$$


where a, b, and c are the distances along the x, y, and z axes, respectively. The length of 
this vector—that is, the distance from the coordinate (0, 0, 0) to (a, b, c)—can be simply 
computed as 


$$\|F\|_e = \sqrt{a^2 + b^2 + c^2}$$

where the nomenclature $\|F\|_e$ indicates that this length is referred to as the Euclidean norm
of [F].


Similarly, for an n-dimensional vector $\lfloor X \rfloor = \lfloor x_1 \;\; x_2 \;\; \cdots \;\; x_n \rfloor$, a Euclidean norm
would be computed as

$$\|X\|_e = \sqrt{\sum_{i=1}^{n} x_i^2}$$


FIGURE 11.1
Graphical depiction of a vector in Euclidean space.

The concept can be extended further to a matrix [A], as in

$$\|A\|_f = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2} \tag{11.5}$$


which is given a special name—the Frobenius norm. As with the other vector norms, it 
provides a single value to quantify the “size” of [A]. 

It should be noted that there are alternatives to the Euclidean and Frobenius norms. For 
vectors, there are alternatives called p norms that can be represented generally by 


$$\|X\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$$

We can see that the Euclidean norm and the 2 norm, $\|X\|_2$, are identical for vectors.
Other important examples are (p = 1)

$$\|X\|_1 = \sum_{i=1}^{n} |x_i|$$

which represents the norm as the sum of the absolute values of the elements. Another is the
maximum-magnitude or uniform-vector norm (p = ∞),

$$\|X\|_\infty = \max_{1 \le i \le n} |x_i|$$


which defines the norm as the element with the largest absolute value. 
Using a similar approach, norms can be developed for matrices. For example, 


$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|$$


That is, a summation of the absolute values of the coefficients is performed for each col- 
umn, and the largest of these summations is taken as the norm. This is called the column- 
sum norm. 

A similar determination can be made for the rows, resulting in a uniform-matrix or 
row-sum norm: 


$$\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$$

It should be noted that, in contrast to vectors, the 2 norm and the Frobenius norm for
a matrix are not the same. Whereas the Frobenius norm $\|A\|_f$ can be easily determined by
Eq. (11.5), the matrix 2 norm $\|A\|_2$ is calculated as

$$\|A\|_2 = (\mu_{\max})^{1/2}$$

where $\mu_{\max}$ is the largest eigenvalue of $[A]^T[A]$. In Chap. 13, we will learn more about
eigenvalues. For the time being, the important point is that $\|A\|_2$, or the spectral norm, is the
minimum norm and, therefore, provides the tightest measure of size (Ortega, 1972).
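The norm definitions above translate almost line for line into MATLAB. The following sketch evaluates each formula directly; the vector x and matrix A are hypothetical test values, and the results can be compared against the built-in norm function.

```matlab
x = [1 -2 3];                      % hypothetical test vector
A = [1 2 3; 4 5 6; 7 8 10];        % hypothetical test matrix

% Vector norms
x1   = sum(abs(x));                % p = 1: sum of absolute values
x2   = sqrt(sum(x.^2));            % Euclidean norm (p = 2)
xinf = max(abs(x));                % p = infinity: largest absolute value

% Matrix norms
colSum = max(sum(abs(A), 1));      % column-sum norm
rowSum = max(sum(abs(A), 2));      % row-sum (uniform-matrix) norm
frob   = sqrt(sum(A(:).^2));       % Frobenius norm, Eq. (11.5)
spec   = sqrt(max(eig(A' * A)));   % spectral norm from the largest eigenvalue of A'A
```

For this matrix the spectral norm is smaller than the Frobenius norm, consistent with the remark that the two differ for matrices.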
11.2.2 Matrix Condition Number 

Now that we have introduced the concept of the norm, we can use it to define 


$$\text{Cond}[A] = \|A\| \cdot \|A^{-1}\|$$

where Cond[A] is called the matrix condition number. Note that for a matrix [A], this
number will be greater than or equal to 1. It can be shown (Ralston and Rabinowitz, 1978;
Gerald and Wheatley, 1989) that

$$\frac{\|\Delta X\|}{\|X\|} \le \text{Cond}[A] \, \frac{\|\Delta A\|}{\|A\|}$$

That is, the relative error of the norm of the computed solution can be as large as the
relative error of the norm of the coefficients of [A] multiplied by the condition number. For
example, if the coefficients of [A] are known to t-digit precision (i.e., rounding errors are
on the order of $10^{-t}$) and Cond[A] = $10^c$, the solution [X] may be valid to only t − c digits
(rounding errors ≈ $10^{c-t}$).
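The error bound can be explored numerically. The sketch below is an illustration only: the test matrix is a mildly ill-conditioned matrix chosen for the demonstration, the perturbation dA is a hypothetical set of coefficient errors, and the relative error is measured against the perturbed solution, for which the inequality holds rigorously.

```matlab
A = [1 1/2 1/3; 1 2/3 1/2; 1 3/4 3/5];    % mildly ill-conditioned test matrix
b = A * ones(3, 1);                       % right-hand side whose exact solution is all ones

dA = 1e-6 * [1 -1 1; -1 1 -1; 1 -1 1];    % hypothetical errors in the coefficients

x  = A \ b;                               % solution with the original coefficients
xp = (A + dA) \ b;                        % solution with the perturbed coefficients

% Relative error of the solution versus the condition-number bound
relErr = norm(xp - x, inf) / norm(xp, inf);
bound  = cond(A, inf) * norm(dA, inf) / norm(A, inf);
```

The observed relative error stays below the bound, which illustrates that the condition number gives a worst-case estimate rather than a prediction of the actual error.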


EXAMPLE 11.3 Matrix Condition Evaluation


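Problem Statement. The Hilbert matrix, which is notoriously ill-conditioned, can be
represented generally as

$$\begin{bmatrix}
1 & \frac{1}{2} & \frac{1}{3} & \cdots & \frac{1}{n} \\
\frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \cdots & \frac{1}{n+1} \\
\vdots & \vdots & \vdots & & \vdots \\
\frac{1}{n} & \frac{1}{n+1} & \frac{1}{n+2} & \cdots & \frac{1}{2n-1}
\end{bmatrix}$$

Use the row-sum norm to estimate the matrix condition number for the 3 × 3 Hilbert matrix:

$$[A] = \begin{bmatrix} 1 & \frac{1}{2} & \frac{1}{3} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \end{bmatrix}$$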


Solution. First, the matrix can be normalized so that the maximum element in each row is 1:

$$[A] = \begin{bmatrix} 1 & \frac{1}{2} & \frac{1}{3} \\ 1 & \frac{2}{3} & \frac{1}{2} \\ 1 & \frac{3}{4} & \frac{3}{5} \end{bmatrix}$$

Summing each of the rows gives 1.833, 2.1667, and 2.35. Thus, the third row has the
largest sum and the row-sum norm is

$$\|A\|_\infty = 1 + \frac{3}{4} + \frac{3}{5} = 2.35$$

The inverse of the scaled matrix can be computed as

$$[A]^{-1} = \begin{bmatrix} 9 & -18 & 10 \\ -36 & 96 & -60 \\ 30 & -90 & 60 \end{bmatrix}$$

Note that the elements of this matrix are larger than those of the original matrix. This is also
reflected in its row-sum norm, which is computed as

$$\|A^{-1}\|_\infty = |-36| + |96| + |-60| = 192$$

Thus, the condition number can be calculated as


Cond[A] = 2.35(192) = 451.2 


The fact that the condition number is much greater than unity suggests that the
system is ill-conditioned. The extent of the ill-conditioning can be quantified by calculating
$c = \log_{10} 451.2 = 2.65$. Hence, the last three significant digits of the solution could
exhibit rounding errors. Note that such estimates almost always overpredict the actual error.
However, they are useful in alerting you to the possibility that roundoff errors may be 
significant. 
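The hand calculation of this example can be mirrored step by step in MATLAB. This is a sketch that applies the row-sum definition directly rather than calling a built-in condition-number routine; hilb is MATLAB's Hilbert-matrix generator, and the variable names are chosen for illustration.

```matlab
H = hilb(3);                                  % 3 x 3 Hilbert matrix
A = diag(1 ./ max(abs(H), [], 2)) * H;        % normalize each row by its largest element

rowSums = sum(abs(A), 2);                     % row sums of the scaled matrix
normA   = max(rowSums);                       % row-sum norm of [A]
normAI  = max(sum(abs(inv(A)), 2));           % row-sum norm of the inverse

condA = normA * normAI;                       % condition number
c = log10(condA);                             % estimate of the number of suspect digits
```

Rounding c up to the next whole number gives the three potentially suspect digits quoted above.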


11.2.3 Norms and Condition Number in MATLAB

MATLAB has built-in functions to compute both norms and condition numbers:

>> norm(X,p)

and

>> cond(X,p)

where X is the vector or matrix and p designates the type of norm or condition number
(1, 2, inf, or 'fro'). Note that the cond function is equivalent to

>> norm(X,p) * norm(inv(X),p)

Also, note that if p is omitted, it is automatically set to 2.


EXAMPLE 11.4 Matrix Condition Evaluation with MATLAB


Problem Statement. Use MATLAB to evaluate both the norms and condition numbers 
for the scaled Hilbert matrix previously analyzed in Example 11.3: 


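$$[A] = \begin{bmatrix} 1 & \frac{1}{2} & \frac{1}{3} \\ 1 & \frac{2}{3} & \frac{1}{2} \\ 1 & \frac{3}{4} & \frac{3}{5} \end{bmatrix}$$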


(a) As in Example 11.3, first compute the row-sum versions (p = inf). (b) Also compute 
the Frobenius (p = 'fro') and the spectral (p = 2) condition numbers. 


Solution: (a) First, enter the matrix: 

>> A = [1 1/2 1/3;1 2/3 1/2;1 3/4 3/5]; 

Then, the row-sum norm and condition number can be computed as 
>> norm(A, inf) 


ans = 
2.3500 


>> cond(A,inf)

ans =
  451.2000

These results correspond to those that were calculated by hand in Example 11.3.

(b) The condition numbers based on the Frobenius and spectral norms are

>> cond(A,'fro')

ans =
  368.0866

>> cond(A)

ans =
  366.3503
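The stated equivalence between cond and the product of norms can be confirmed for each norm type. The following sketch forms the products explicitly; the variable names are illustrative, and agreement with cond is expected only to within roundoff.

```matlab
A = [1 1/2 1/3; 1 2/3 1/2; 1 3/4 3/5];
AI = inv(A);

% Condition numbers formed explicitly as norm(A) * norm(inverse of A)
condInf  = norm(A, inf)   * norm(AI, inf);     % row-sum version
condFro  = norm(A, 'fro') * norm(AI, 'fro');   % Frobenius version
condSpec = norm(A, 2)     * norm(AI, 2);       % spectral version
```

In this example the spectral condition number is the smallest of the three, in line with the spectral norm being the minimum norm.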


11.3 CASE STUDY: INDOOR AIR POLLUTION


Background. As the name implies, indoor air pollution deals with air contamination in 
enclosed spaces such as homes, offices, and work areas. Suppose that you are studying the 
ventilation system for Bubba’s Gas ’N Guzzle, a truck-stop restaurant located adjacent to 
an eight-lane freeway. 

As depicted in Fig. 11.2, the restaurant serving area consists of two rooms for smokers 
and kids and one elongated room. Room 1 and section 3 have sources of carbon monoxide 
from smokers and a faulty grill, respectively. In addition, rooms 1 and 2 gain carbon mon- 
oxide from air intakes that unfortunately are positioned alongside the freeway. 


FIGURE 11.2 

Overhead view of rooms in a restaurant. The one-way arrows represent volumetric airflows, 
whereas the two-way arrows represent diffusive mixing. The smoker and grill loads add car- 
bon monoxide mass to the system but negligible airflow. 


[Figure 11.2 labels: $Q_c$ = 150 m³/hr and $Q_d$ = 100 m³/hr (exhausts); $Q_b$ = 50 m³/hr, $c_b$ = 2 mg/m³ (kids’ section intake); $Q_a$ = 200 m³/hr, $c_a$ = 2 mg/m³ (smoking section intake); diffusive mixing of 25 m³/hr; smoker load 1000 mg/hr; grill load 2000 mg/hr.]

Write steady-state mass balances for each room and solve the resulting linear
algebraic equations for the concentration of carbon monoxide in each room. In addition,
generate the matrix inverse and use it to analyze how the various sources affect the kids’
room. For example, determine what percent of the carbon monoxide in the kids’ section
is due to (1) the smokers, (2) the grill, and (3) the intake vents. In addition, compute the
improvement in the kids’ section concentration if the carbon monoxide load is decreased
by banning smoking and fixing the grill. Finally, analyze how the concentration in the
kids’ area would change if a screen is constructed so that the mixing between areas 2 and
4 is decreased to 5 m³/hr.


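Solution. Steady-state mass balances can be written for each room. For example, the
balance for the smoking section (room 1) is

$$0 = W_{\text{smoker}} + Q_a c_a - Q_a c_1 + E_{13}(c_3 - c_1)$$
(Load) + (Inflow) − (Outflow) + (Mixing)

Similar balances can be written for the other rooms:

$$\begin{aligned}
0 &= Q_b c_b + (Q_a - Q_d) c_4 - Q_c c_2 + E_{24}(c_4 - c_2) \\
0 &= W_{\text{grill}} + Q_a c_1 + E_{13}(c_1 - c_3) + E_{34}(c_4 - c_3) - Q_a c_3 \\
0 &= Q_a c_3 + E_{34}(c_3 - c_4) + E_{24}(c_2 - c_4) - Q_a c_4
\end{aligned}$$

Substituting the parameters yields the final system of equations:

$$\begin{bmatrix} 225 & 0 & -25 & 0 \\ 0 & 175 & 0 & -125 \\ -225 & 0 & 275 & -50 \\ 0 & -25 & -250 & 275 \end{bmatrix} \begin{Bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{Bmatrix} = \begin{Bmatrix} 1400 \\ 100 \\ 2000 \\ 0 \end{Bmatrix}$$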


MATLAB can be used to generate the solution. First, we can compute the inverse. 
Note that we use the “short g” format in order to obtain five significant digits of precision: 


>> format short g
>> A=[225 0 -25 0
0 175 0 -125
-225 0 275 -50
0 -25 -250 275];
>> AI=inv(A)


AI = 
0.0049962 1.5326e-005 0.00055172 0.00010728 
0.0034483 0.0062069 0.0034483 0.0034483 
0.0049655 0.00013793 0.0049655 0.00096552 
0.0048276 0.00068966 0.0048276 0.0048276


The solution can then be generated as 


>> b=[1400 100 2000 0]'; 


>> c=AI*b

c =
8.0996 
12.345 
16.897 
16.483 


Thus, we get the surprising result that the smoking section has the lowest carbon mon- 
oxide levels! The highest concentrations occur in rooms 3 and 4 with section 2 having an 
intermediate level. These results take place because (a) carbon monoxide is conservative 
and (b) the only air exhausts are out of sections 2 and 4 ($Q_c$ and $Q_d$). Room 3 is so bad
because not only does it get the load from the faulty grill, but it also receives the effluent 
from room 1. 

Although the foregoing is interesting, the real power of linear systems comes from 
using the elements of the matrix inverse to understand how the parts of the system interact. 
For example, the elements of the matrix inverse can be used to determine the percent of the 
carbon monoxide in the kids’ section due to each source: 

The smokers:

$$c_{2,\text{smokers}} = a_{21}^{-1} W_{\text{smokers}} = 0.0034483(1000) = 3.4483$$

$$\%_{\text{smokers}} = \frac{3.4483}{12.345} \times 100\% = 27.93\%$$

The grill:

$$c_{2,\text{grill}} = a_{23}^{-1} W_{\text{grill}} = 0.0034483(2000) = 6.897$$

$$\%_{\text{grill}} = \frac{6.897}{12.345} \times 100\% = 55.87\%$$

The intakes:

$$c_{2,\text{intakes}} = a_{21}^{-1} Q_a c_a + a_{22}^{-1} Q_b c_b = 0.0034483(200)2 + 0.0062069(50)2 = 1.37931 + 0.62069 = 2$$

$$\%_{\text{intakes}} = \frac{2}{12.345} \times 100\% = 16.20\%$$

The faulty grill is clearly the most significant source.
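The source-by-source breakdown can be reproduced with a few lines of MATLAB. This is a sketch built on the system matrix and loads given above; the variable names (Wsmoker, c2grill, pct, and so on) are introduced here for illustration.

```matlab
A = [225 0 -25 0; 0 175 0 -125; -225 0 275 -50; 0 -25 -250 275];
AI = inv(A);
b = [1400; 100; 2000; 0];
c = AI * b;                          % concentrations in rooms 1 through 4 (mg/m^3)

Wsmoker = 1000;  Wgrill = 2000;      % loads (mg/hr)
Qa = 200; ca = 2;                    % smoking-section intake (m^3/hr, mg/m^3)
Qb = 50;  cb = 2;                    % kids'-section intake (m^3/hr, mg/m^3)

% Contribution of each source to the kids' section (row 2 of the inverse)
c2smoker = AI(2,1) * Wsmoker;
c2grill  = AI(2,3) * Wgrill;
c2intake = AI(2,1) * Qa * ca + AI(2,2) * Qb * cb;

% Percent of the kids'-section concentration due to each source
pct = 100 * [c2smoker c2grill c2intake] / c(2);
```

Because the model is linear, the three contributions sum to the total concentration in the kids’ section, so the percentages add to 100%.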

The inverse can also be employed to determine the impact of proposed remedies such 
as banning smoking and fixing the grill. Because the model is linear, superposition holds
and the results can be determined individually and summed:

$$\Delta c_2 = a_{21}^{-1}\,\Delta W_{\text{smoker}} + a_{23}^{-1}\,\Delta W_{\text{grill}} = 0.0034483(-1000) + 0.0034483(-2000) = -3.4483 - 6.8966 = -10.345$$


Note that the same computation would be made in MATLAB as 


>> AI(2,1)*(-1000) +AI(2,3)*(-2000) 


ans = 
-10.345 


Implementing both remedies would reduce the concentration by 10.345 mg/m³. The result
would bring the kids’ room concentration to 12.345 − 10.345 = 2 mg/m³. This makes
sense, because in the absence of the smoker and grill loads, the only sources are the air
intakes, which are at 2 mg/m³.

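Because all the foregoing calculations involved changing the forcing functions, it was
not necessary to recompute the solution. However, if the mixing between the kids’ area and
zone 4 is decreased, the matrix is changed:

$$\begin{bmatrix} 225 & 0 & -25 & 0 \\ 0 & 155 & 0 & -105 \\ -225 & 0 & 275 & -50 \\ 0 & -5 & -250 & 255 \end{bmatrix} \begin{Bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{Bmatrix} = \begin{Bmatrix} 1400 \\ 100 \\ 2000 \\ 0 \end{Bmatrix}$$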


The results for this case involve a new solution. Using MATLAB, the result is

$$\begin{Bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{Bmatrix} = \begin{Bmatrix} 8.1084 \\ 12.0800 \\ 16.9760 \\ 16.8800 \end{Bmatrix}$$

Therefore, this remedy would only improve the kids’ area concentration by a paltry
0.265 mg/m³.
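The modified system can be set up and solved in MATLAB as sketched below. The code assumes the balances given earlier, in which the mixing coefficient between areas 2 and 4 appears only in rows 2 and 4 of the matrix; E24, Anew, cOld, and cNew are illustrative names.

```matlab
A = [225 0 -25 0; 0 175 0 -125; -225 0 275 -50; 0 -25 -250 275];
b = [1400; 100; 2000; 0];
cOld = A \ b;                        % concentrations with the original mixing

E24 = 5;                             % reduced mixing between areas 2 and 4 (m^3/hr)
Anew = A;
Anew(2, [2 4]) = [150 + E24, -(100 + E24)];   % room 2 balance: Qc + E24 and -(Qa - Qd + E24)
Anew(4, [2 4]) = [-E24, 250 + E24];           % room 4 balance: -E24 and Qa + E34 + E24

cNew = Anew \ b;                     % a new solution is required because [A] changed
improvement = cOld(2) - cNew(2);     % reduction in the kids' section (mg/m^3)
```

Unlike the load changes considered earlier, a change to the coefficient matrix cannot be handled with the original inverse, so the system is solved again.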


PROBLEMS 


11.1 Determine the matrix inverse for the following system:

$$\begin{aligned}
10x_1 + 2x_2 - x_3 &= 27 \\
-3x_1 - 6x_2 + 2x_3 &= -61.5 \\
x_1 + x_2 + 5x_3 &= -21.5
\end{aligned}$$

Check your results by verifying that $[A][A]^{-1} = [I]$. Do not
use a pivoting strategy.

11.2 Determine the matrix inverse for the following system:

$$\begin{aligned}
-8x_1 + x_2 - 2x_3 &= -20 \\
2x_1 - 6x_2 - x_3 &= -38 \\
-3x_1 - x_2 + 7x_3 &= -34
\end{aligned}$$

11.3 The following system of equations is designed to
determine concentrations (the c’s in g/m³) in a series of
coupled reactors as a function of the amount of mass input
to each reactor (the right-hand sides in g/day):

$$\begin{aligned}
15c_1 - 3c_2 - c_3 &= 4000 \\
-3c_1 + 18c_2 - 6c_3 &= 1200 \\
-4c_1 - c_2 + 12c_3 &= 2350
\end{aligned}$$


(a) Determine the matrix inverse. 

(b) Use the inverse to determine the solution. 

(c) Determine how much the rate of mass input to reactor 3
must be increased to induce a 10 g/m³ rise in the
concentration of reactor 1.

(d) How much will the concentration in reactor 3 be reduced 
if the rate of mass input to reactors 1 and 2 is reduced by 
500 and 250 g/day, respectively? 

11.4 Determine the matrix inverse for the system
described in Prob. 8.9. Use the matrix inverse to determine the
concentration in reactor 5 if the inflow concentrations are
changed to $c_{01} = 20$ and $c_{03} = 50$.

11.5 Determine the matrix inverse for the system described
in Prob. 8.10. Use the matrix inverse to determine the force
in the three members ($F_1$, $F_2$, and $F_3$) if the vertical load at
node 1 is doubled to $F_{1,v} = -2000$ N and a horizontal load of
$F_{3,h} = -500$ N is applied to node 3.

11.6 Determine $\|A\|_f$, $\|A\|_1$, and $\|A\|_\infty$ for

$$[A] = \begin{bmatrix} 8 & 2 & -10 \\ -9 & 1 & 3 \\ 15 & -1 & 6 \end{bmatrix}$$


Before determining the norms, scale the matrix by making 
the maximum element in each row equal to one. 

11.7 Determine the Frobenius and row-sum norms for the 
systems in Probs. 11.2 and 11.3. 

11.8 Use MATLAB to determine the spectral condition num- 
ber for the following system. Do not normalize the system: 


$$\begin{bmatrix} 1 & 4 & 9 & 16 & 25 \\ 4 & 9 & 16 & 25 & 36 \\ 9 & 16 & 25 & 36 & 49 \\ 16 & 25 & 36 & 49 & 64 \\ 25 & 36 & 49 & 64 & 81 \end{bmatrix}$$


Compute the condition number based on the row-sum norm. 
11.9 Besides the Hilbert matrix, there are other matrices 
that are inherently ill-conditioned. One such case is the 
Vandermonde matrix, which has the following form: 


$$\begin{bmatrix} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ x_3^2 & x_3 & 1 \end{bmatrix}$$

(a) Determine the condition number based on the row-sum
norm for the case where $x_1 = 4$, $x_2 = 2$, and $x_3 = 7$.
(b) Use MATLAB to compute the spectral and Frobenius 
condition numbers. 

11.10 Use MATLAB to determine the spectral condition 
number for a 10-dimensional Hilbert matrix. How many dig- 
its of precision are expected to be lost due to ill-conditioning? 
Determine the solution for this system for the case where 
each element of the right-hand-side vector {b} consists of 
the summation of the coefficients in its row. In other words, 
solve for the case where all the unknowns should be exactly 
one. Compare the resulting errors with those expected based 
on the condition number. 
11.11 Repeat Prob. 11.10, but for the case of a six- 
dimensional Vandermonde matrix (see Prob. 11.9) where 
$x_1 = 4$, $x_2 = 2$, $x_3 = 7$, $x_4 = 10$, $x_5 = 3$, and $x_6 = 5$.
11.12 The Lower Colorado River consists of a series of four 
reservoirs as shown in Fig. P11.12. 

Mass balances can be written for each reservoir, and 
the following set of simultaneous linear algebraic equations 
results: 


$$\begin{bmatrix} 13.422 & 0 & 0 & 0 \\ -13.422 & 12.252 & 0 & 0 \\ 0 & -12.252 & 12.377 & 0 \\ 0 & 0 & -12.377 & 11.797 \end{bmatrix} \begin{Bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{Bmatrix} = \begin{Bmatrix} 750.5 \\ 300 \\ 102 \\ 30 \end{Bmatrix}$$

where the right-hand-side vector consists of the loadings of
chloride to each of the four lakes and $c_1$, $c_2$, $c_3$, and $c_4$ = the
resulting chloride concentrations for Lakes Powell, Mead,
Mohave, and Havasu, respectively.

(a) Use the matrix inverse to solve for the concentrations in 
each of the four lakes. 

(b) How much must the loading to Lake Powell be reduced 
for the chloride concentration of Lake Havasu to be 75? 

(c) Using the column-sum norm, compute the condition 
number and how many suspect digits would be gener- 
ated by solving this system. 

11.13 (a) Determine the matrix inverse and condition
number for the following matrix:

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$$

(b) Repeat (a) but change $a_{33}$ slightly to 9.1.

FIGURE P11.12
The Lower Colorado River.

11.14 Polynomial interpolation consists of determining the 
unique (n — 1)th-order polynomial that fits n data points. 
Such polynomials have the general form,

$$f(x) = p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n \tag{P11.14}$$

where the p’s are constant coefficients. A straightforward
way for computing the coefficients is to generate n linear
algebraic equations that we can solve simultaneously for
the coefficients. Suppose that we want to determine the
coefficients of the fourth-order polynomial $f(x) = p_1 x^4 +
p_2 x^3 + p_3 x^2 + p_4 x + p_5$ that passes through the following five
points: (200, 0.746), (250, 0.675), (300, 0.616), (400, 0.525), 
and (500, 0.457). Each of these pairs can be substituted into 
Eq. (P11.14) to yield a system of five equations with five 
unknowns (the p’s). Use this approach to solve for the coef- 
ficients. In addition, determine and interpret the condition 
number. 

11.15 A chemical constituent flows between three reactors 
as depicted in Fig. P11.15. Steady-state mass balances can 
be written for a substance that reacts with first-order kinet- 
ics. For example, the mass balance for reactor 1 is 


$$Q_{1,\text{in}}\, c_{1,\text{in}} - Q_{1,2}\, c_1 - Q_{1,3}\, c_1 + Q_{2,1}\, c_2 - k V_1 c_1 = 0 \tag{P11.15}$$

where $Q_{1,\text{in}}$ = the volumetric inflow to reactor 1 (m³/min),
$c_{1,\text{in}}$ = the inflow concentration to reactor 1 (g/m³), $Q_{i,j}$ = the
flow from reactor i to reactor j (m³/min), $c_i$ = the concentration
of reactor i (g/m³), k = a first-order decay rate (/min),
and $V_i$ = the volume of reactor i (m³).

(a) Write the mass balances for reactors 2 and 3.

(b) If k = 0.1/min, write the mass balances for all three
reactors as a system of linear algebraic equations.

(c) Compute the LU decomposition for this system.

(d) Use the LU decomposition to compute the matrix inverse.

(e) Use the matrix inverse to answer the following questions:
(i) What are the steady-state concentrations for the three
reactors? (ii) If the inflow concentration to the second
reactor is set to zero, what is the resulting reduction in
concentration of reactor 1? (iii) If the inflow concentration
to reactor 1 is doubled, and the inflow concentration to
reactor 2 is halved, what is the concentration of reactor 3?

FIGURE P11.15


[Figure P11.18 labels recoverable from the original: Room 1, Room 2, a load of 5000 mg/hr, concentrations of c = 40 mg/m³ and c = 1 mg/m³, and flows of 50 m³/hr.]

FIGURE P11.18


11.16 As described in Examples 8.2 and 11.2, use the
matrix inverse to answer the following:

(a) Determine the change in position of the first jumper, if
the mass of the third jumper is increased to 100 kg.

(b) What force must be applied to the third jumper so that
the final position of the third jumper is 140 m?

11.17 Determine the matrix inverse for the electric circuit
formulated in Sec. 8.3. Use the inverse to determine the new
current between nodes 2 and 5 ($i_{52}$), if a voltage of 200 V is
applied at node 6 and the voltage at node 1 is halved.

11.18 (a) Using the same approach as described in Sec. 11.3,
develop steady-state mass balances for the room
configuration depicted in Fig. P11.18.

(b) Determine the matrix inverse and use it to calculate the
resulting concentrations in the rooms.

(c) Use the matrix inverse to determine how much the room
4 load must be reduced to maintain a concentration of
20 mg/m³ in room 2.

11.19 Write your own well-structured MATLAB Function
procedure named Fnorm to calculate the Frobenius norm of
an m × n matrix with for...end loops,

$$\|A\|_f = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2}$$

Have the function scale the matrix before computing the
norm. Test your function with the following script:

A = [5 7 -9; 1 8 4; 7 6 2];
Fn = Fnorm(A)

Here is the first line of your function

function Norm = Fnorm(x)


11.20 Figure P11.20 shows a statically determinate truss. 




FIGURE P11.20 
Forces on a statically determinate truss. 


This type of structure can be described as a system of cou- 
pled linear algebraic equations by developing free-body 
force diagrams for the forces at each node in Fig. P11.20. 
The sum of the forces in both horizontal and vertical direc- 
tions must be zero at each node, because the system is at rest. 
Therefore, for node 1, 


$$\begin{aligned}
\sum F_H &= 0 = -F_1 \cos 30^\circ + F_3 \cos 60^\circ + F_{1,h} \\
\sum F_V &= 0 = -F_1 \sin 30^\circ - F_3 \sin 60^\circ + F_{1,v}
\end{aligned}$$

for node 2,

$$\begin{aligned}
\sum F_H &= 0 = F_2 + F_1 \cos 30^\circ + F_{2,h} + H_2 \\
\sum F_V &= 0 = F_1 \sin 30^\circ + F_{2,v} + V_2
\end{aligned}$$

for node 3,

$$\begin{aligned}
\sum F_H &= 0 = -F_2 - F_3 \cos 60^\circ + F_{3,h} \\
\sum F_V &= 0 = F_3 \sin 60^\circ + F_{3,v} + V_3
\end{aligned}$$

where $F_{i,h}$ is the external horizontal force applied to node i
(where a positive force is from left to right) and $F_{i,v}$ is the
external vertical force applied to node i (where a positive force
is upward). Thus, in this problem, the 1000-N downward
force on node 1 corresponds to $F_{1,v} = -1000$. For this case all
other $F_{i,v}$’s and $F_{i,h}$’s are zero. Note that the directions of the
internal forces and reactions are unknown. Proper applica- 
tion of Newton’s laws requires only consistent assumptions 
regarding direction. Solutions are negative if the directions 
are assumed incorrectly. Also note that in this problem, the 
forces in all members are assumed to be in tension and act to 
pull adjoining nodes together. A negative solution therefore 
corresponds to compression. When the external forces are 
substituted and the trigonometric functions evaluated, this 
problem reduces to a set of six linear algebraic equations 
with six unknowns. 

(a) Solve for the forces and reactions for the case displayed 
in Fig. P11.20. 

(b) Determine the system’s matrix inverse. What is your 
interpretation of the zeros in the second row of the 
inverse? 

(c) Use the elements of the matrix inverse to answer the following
questions:

(i) If the force at node 1 was reversed (i.e., directed
upward), compute the impact on $H_2$ and $V_2$.

(ii) If the force at node 1 was set to zero and horizontal
forces of 1500 N were applied at nodes 1 and 2
($F_{1,h} = F_{2,h} = 1500$), what would be the vertical
reaction at node 3 ($V_3$)?
11.21 Employing the same approach as in Prob. 11.20, 
(a) Compute the forces and reactions for the members and 
supports for the truss depicted in Fig. P11.21. 
(b) Compute the matrix inverse. 
(c) Determine the change in the reactions at the two sup- 
ports if the force at the peak is directed upward. 


600 


FIGURE P11.21 


12


Iterative Methods


CHAPTER OBJECTIVES 


The primary objective of this chapter is to acquaint you with iterative methods for 
solving simultaneous equations. Specific objectives and topics covered are 


Understanding the difference between the Gauss-Seidel and Jacobi methods. 


Knowing how to assess diagonal dominance and knowing what it means. 
Recognizing how relaxation can be used to improve the convergence of iterative 
methods. 

Understanding how to solve systems of nonlinear equations with successive 
substitution, Newton-Raphson, and the MATLAB fsolve function. 


Iterative or approximate methods provide an alternative to the elimination methods
described to this point. Such approaches are similar to the techniques we developed to
obtain the roots of a single equation in Chaps. 5 and 6. Those approaches consisted of
guessing a value and then using a systematic method to obtain a refined estimate of the 
root. Because the present part of the book deals with a similar problem—obtaining the 
values that simultaneously satisfy a set of equations—we might suspect that such approxi- 
mate methods could be useful in this context. In this chapter, we will present approaches 
for solving both linear and nonlinear simultaneous equations. 


12.1 LINEAR SYSTEMS: GAUSS-SEIDEL


The Gauss-Seidel method is the most commonly used iterative method for solving linear 
algebraic equations. Assume that we are given a set of n equations: 


[A] {x} = {b} 


Suppose that for conciseness we limit ourselves to a 3 × 3 set of equations. If the diagonal
elements are all nonzero, the first equation can be solved for $x_1$, the second for $x_2$, and the
third for $x_3$ to yield

$$x_1^{j} = \frac{b_1 - a_{12}x_2^{j-1} - a_{13}x_3^{j-1}}{a_{11}} \tag{12.1a}$$

$$x_2^{j} = \frac{b_2 - a_{21}x_1^{j} - a_{23}x_3^{j-1}}{a_{22}} \tag{12.1b}$$

$$x_3^{j} = \frac{b_3 - a_{31}x_1^{j} - a_{32}x_2^{j}}{a_{33}} \tag{12.1c}$$

where $j$ and $j - 1$ are the present and previous iterations.

To start the solution process, initial guesses must be made for the x’s. A simple approach
is to assume that they are all zero. These zeros can be substituted into Eq. (12.1a),
which can be used to calculate a new value for $x_1 = b_1/a_{11}$. Then we substitute this new
value of $x_1$ along with the previous guess of zero for $x_3$ into Eq. (12.1b) to compute a new
value for $x_2$. The process is repeated for Eq. (12.1c) to calculate a new estimate for $x_3$. Then
we return to the first equation and repeat the entire procedure until our solution converges
closely enough to the true values. Convergence can be checked using the criterion that for
all $i$,

$$\varepsilon_{a,i} = \left|\frac{x_i^{j} - x_i^{j-1}}{x_i^{j}}\right| \times 100\% \le \varepsilon_s \tag{12.2}$$

EXAMPLE 12.1


Gauss-Seidel Method


Problem Statement. Use the Gauss-Seidel method to obtain the solution for

$$\begin{aligned}
3x_1 - 0.1x_2 - 0.2x_3 &= 7.85 \\
0.1x_1 + 7x_2 - 0.3x_3 &= -19.3 \\
0.3x_1 - 0.2x_2 + 10x_3 &= 71.4
\end{aligned}$$

Note that the solution is $x_1 = 3$, $x_2 = -2.5$, and $x_3 = 7$.


Solution. First, solve each of the equations for its unknown on the diagonal:

$$x_1 = \frac{7.85 + 0.1x_2 + 0.2x_3}{3} \tag{E12.1.1}$$

$$x_2 = \frac{-19.3 - 0.1x_1 + 0.3x_3}{7} \tag{E12.1.2}$$

$$x_3 = \frac{71.4 - 0.3x_1 + 0.2x_2}{10} \tag{E12.1.3}$$

By assuming that $x_2$ and $x_3$ are zero, Eq. (E12.1.1) can be used to compute

$$x_1 = \frac{7.85 + 0.1(0) + 0.2(0)}{3} = 2.616667$$

This value, along with the assumed value of $x_3 = 0$, can be substituted into Eq. (E12.1.2)
to calculate

$$x_2 = \frac{-19.3 - 0.1(2.616667) + 0.3(0)}{7} = -2.794524$$

The first iteration is completed by substituting the calculated values for $x_1$ and $x_2$ into
Eq. (E12.1.3) to yield

$$x_3 = \frac{71.4 - 0.3(2.616667) + 0.2(-2.794524)}{10} = 7.005610$$

For the second iteration, the same process is repeated to compute

$$x_1 = \frac{7.85 + 0.1(-2.794524) + 0.2(7.005610)}{3} = 2.990557$$

$$x_2 = \frac{-19.3 - 0.1(2.990557) + 0.3(7.005610)}{7} = -2.499625$$

$$x_3 = \frac{71.4 - 0.3(2.990557) + 0.2(-2.499625)}{10} = 7.000291$$


The method is, therefore, converging on the true solution. Additional iterations could be
applied to improve the answers. However, in an actual problem, we would not know the
true answer a priori. Consequently, Eq. (12.2) provides a means to estimate the error. For
example, for $x_1$:

$$\varepsilon_{a,1} = \left|\frac{2.990557 - 2.616667}{2.990557}\right| \times 100\% = 12.5\%$$

For $x_2$ and $x_3$, the error estimates are $\varepsilon_{a,2} = 11.8\%$ and $\varepsilon_{a,3} = 0.076\%$. Note that, as was the
case when determining roots of a single equation, formulations such as Eq. (12.2) usually
provide a conservative appraisal of convergence. Thus, when they are met, they ensure that
the result is known to at least the tolerance specified by $\varepsilon_s$.
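The hand calculation in this example can be checked with a few lines of MATLAB. The following is only a minimal sketch: the three update formulas are typed out explicitly for this particular system, and the variable names (x, xold, ea) are chosen here for illustration.

```matlab
% Two Gauss-Seidel sweeps for the 3 x 3 system of this example.
x = [0; 0; 0];                                 % initial guesses of zero
for iter = 1:2
    xold = x;                                  % keep previous iterate for Eq. (12.2)
    x(1) = (7.85 + 0.1*x(2) + 0.2*x(3))/3;     % Eq. (E12.1.1)
    x(2) = (-19.3 - 0.1*x(1) + 0.3*x(3))/7;    % Eq. (E12.1.2), uses the new x(1)
    x(3) = (71.4 - 0.3*x(1) + 0.2*x(2))/10;    % Eq. (E12.1.3), uses new x(1) and x(2)
    ea = abs((x - xold)./x)*100;               % approximate percent relative errors
    fprintf('iter %d: x = [%.6f %.6f %.6f]\n', iter, x);
end
```

After the second pass, x and ea should agree with the values computed by hand above to the digits shown.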


As each new x value is computed for the Gauss-Seidel method, it is immediately used 
in the next equation to determine another x value. Thus, if the solution is converging, the 
best available estimates will be employed. An alternative approach, called Jacobi itera- 
tion, utilizes a somewhat different tactic. Rather than using the latest available x’s, this 
technique uses Eq. (12.1) to compute a set of new x’s on the basis of a set of old x’s. Thus, 
as new values are generated, they are not immediately used but rather are retained for the 
next iteration. 

The difference between the Gauss-Seidel method and Jacobi iteration is depicted in 
Fig. 12.1. Although there are certain cases where the Jacobi method is useful, Gauss-Seidel’s 
utilization of the best available estimates usually makes it the method of preference. 
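To make the contrast concrete, here is a short sketch of Jacobi iteration applied to the system of Example 12.1. It assumes the splitting of [A] into its diagonal part D and off-diagonal part R, which is one common way to write the method; the names D and R and the fixed count of 25 iterations are illustrative choices, not part of the original text.

```matlab
% Jacobi iteration: every new component is computed from the OLD vector only.
A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];
b = [7.85; -19.3; 71.4];
D = diag(diag(A));        % diagonal part of A
R = A - D;                % off-diagonal part of A
x = zeros(3,1);           % initial guesses of zero
for iter = 1:25
    x = D\(b - R*x);      % the right-hand side uses the previous iterate throughout
end
disp(x')
```

Because the whole vector is updated at once, none of the new values is used until the next pass, which is exactly the distinction drawn above.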
FIGURE 12.1
Graphical depiction of the difference between (a) the Gauss-Seidel and (b) the Jacobi
iterative methods for solving simultaneous linear algebraic equations. [Each panel lists
the three update equations of Eq. (12.1) for the first and second iterations.]


12.1.1 Convergence and Diagonal Dominance 


Note that the Gauss-Seidel method is similar in spirit to the technique of simple fixed-point 
iteration that was used in Sec. 6.1 to solve for the roots of a single equation. Recall that 
simple fixed-point iteration was sometimes nonconvergent. That is, as the iterations pro- 
gressed, the answer moved farther and farther from the correct result. 

Although the Gauss-Seidel method can also diverge, because it is designed for linear 
systems, its ability to converge is much more predictable than for fixed-point iteration of 
nonlinear equations. It can be shown that if the following condition holds, Gauss-Seidel 
will converge: 


$$|a_{ii}| > \sum_{\substack{j=1 \\ j \ne i}}^{n} |a_{ij}| \tag{12.3}$$


That is, the absolute value of the diagonal coefficient in each of the equations must be 
larger than the sum of the absolute values of the other coefficients in the equation. Such 
systems are said to be diagonally dominant. This criterion is sufficient but not necessary 
for convergence. That is, although the method may sometimes work if Eq. (12.3) is not 
met, convergence is guaranteed if the condition is satisfied. Fortunately, many engineer- 
ing and scientific problems of practical importance fulfill this requirement. Therefore, 
Gauss-Seidel represents a feasible approach to solve many problems in engineering 
and science. 
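Equation (12.3) is easy to test numerically before attempting an iterative solution. The sketch below assumes a hypothetical helper named isDD (not a built-in MATLAB function) and applies it to the coefficient matrix of Example 12.1 and to a 2 × 2 matrix whose rows are in an unfavorable order.

```matlab
% Check strict diagonal dominance, Eq. (12.3), row by row.
% isDD is a hypothetical helper defined here for illustration.
isDD = @(M) all(abs(diag(M)) > sum(abs(M),2) - abs(diag(M)));

A = [3 -0.1 -0.2; 0.1 7 -0.3; 0.3 -0.2 10];
B = [-3 12; 10 -2];              % rows in an unfavorable order

domA  = isDD(A);                 % true: each |a_ii| exceeds its off-diagonal row sum
domB  = isDD(B);                 % false as written
domB2 = isDD(B([2 1],:));        % true after swapping the two rows
```

Because the criterion is sufficient but not necessary, a false result does not prove that Gauss-Seidel will diverge; it only means that convergence is not guaranteed by Eq. (12.3).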
12.1.2 MATLAB M-file: GaussSeidel


Before developing an algorithm, let us first recast Gauss-Seidel in a form that is com- 
patible with MATLAB’s ability to perform matrix operations. This is done by expressing 
Eq. (12.1) as 


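$$x_1^{\text{new}} = \frac{b_1}{a_{11}} - \frac{a_{12}}{a_{11}}x_2^{\text{old}} - \frac{a_{13}}{a_{11}}x_3^{\text{old}}$$

$$x_2^{\text{new}} = \frac{b_2}{a_{22}} - \frac{a_{21}}{a_{22}}x_1^{\text{new}} - \frac{a_{23}}{a_{22}}x_3^{\text{old}}$$

$$x_3^{\text{new}} = \frac{b_3}{a_{33}} - \frac{a_{31}}{a_{33}}x_1^{\text{new}} - \frac{a_{32}}{a_{33}}x_2^{\text{new}}$$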


Notice that the solution can be expressed concisely in matrix form as 


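$$\{x\} = \{d\} - [C]\{x\} \tag{12.4}$$

where

$$\{d\} = \begin{Bmatrix} b_1/a_{11} \\ b_2/a_{22} \\ b_3/a_{33} \end{Bmatrix}$$

and

$$[C] = \begin{bmatrix} 0 & a_{12}/a_{11} & a_{13}/a_{11} \\ a_{21}/a_{22} & 0 & a_{23}/a_{22} \\ a_{31}/a_{33} & a_{32}/a_{33} & 0 \end{bmatrix}$$

An M-file to implement Eq. (12.4) is listed in Fig. 12.2.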


12.1.3 Relaxation 


Relaxation represents a slight modification of the Gauss-Seidel method that is designed to 
enhance convergence. After each new value of x is computed using Eq. (12.1), that value 
is modified by a weighted average of the results of the previous and the present iterations: 


$$x_i^{\text{new}} = \lambda x_i^{\text{new}} + (1 - \lambda)x_i^{\text{old}} \tag{12.5}$$

where $\lambda$ is a weighting factor that is assigned a value between 0 and 2.

If $\lambda = 1$, $(1 - \lambda)$ is equal to 0 and the result is unmodified. However, if $\lambda$ is set at a
value between 0 and 1, the result is a weighted average of the present and the previous results.
This type of modification is called underrelaxation. It is typically employed to make
a nonconvergent system converge or to hasten convergence by dampening out oscillations.

For values of $\lambda$ from 1 to 2, extra weight is placed on the present value. In this instance,
there is an implicit assumption that the new value is moving in the correct direction toward
the true solution but at too slow a rate. Thus, the added weight of $\lambda$ is intended to improve
the estimate by pushing it closer to the truth. Hence, this type of modification, which is
called overrelaxation, is designed to accelerate the convergence of an already convergent
system. The approach is also called successive overrelaxation, or SOR.
function x = GaussSeidel(A,b,es,maxit)
% GaussSeidel: Gauss Seidel method
%   x = GaussSeidel(A,b): Gauss Seidel without relaxation
% input:
%   A = coefficient matrix
%   b = right hand side vector
%   es = stop criterion (default = 0.00001%)
%   maxit = max iterations (default = 50)
% output:
%   x = solution vector

if nargin<2,error('at least 2 input arguments required'),end
if nargin<4||isempty(maxit),maxit=50;end
if nargin<3||isempty(es),es=0.00001;end
[m,n] = size(A);
if m~=n, error('Matrix A must be square'); end
C = A;
for i = 1:n
  C(i,i) = 0;                    % zero the diagonal of [C]
  x(i) = 0;                      % initial guesses of zero
end
x = x';
for i = 1:n
  C(i,1:n) = C(i,1:n)/A(i,i);    % rows of [C] in Eq. (12.4)
end
for i = 1:n
  d(i) = b(i)/A(i,i);            % elements of {d} in Eq. (12.4)
end
iter = 0;
while (1)
  xold = x;
  for i = 1:n
    x(i) = d(i) - C(i,:)*x;      % uses the most recent x values
    if x(i) ~= 0
      ea(i) = abs((x(i) - xold(i))/x(i)) * 100;
    end
  end
  iter = iter + 1;
  if max(ea)<=es || iter >= maxit, break, end
end


FIGURE 12.2
MATLAB M-file to implement Gauss-Seidel.


The choice of a proper value for $\lambda$ is highly problem-specific and is often determined
empirically. For a single solution of a set of equations it is often unnecessary. However,
if the system under study is to be solved repeatedly, the efficiency introduced by a wise
choice of $\lambda$ can be extremely important. Good examples are the very large systems of linear
algebraic equations that can occur when solving partial differential equations in a variety of
engineering and scientific problem contexts.
EXAMPLE 12.2


Gauss-Seidel Method with Relaxation


Problem Statement. Solve the following system with Gauss-Seidel using overrelaxation
($\lambda = 1.2$) and a stopping criterion of $\varepsilon_s = 10\%$:

$$\begin{aligned}
-3x_1 + 12x_2 &= 9 \\
10x_1 - 2x_2 &= 8
\end{aligned}$$

Solution. First rearrange the equations so that they are diagonally dominant and solve the
first equation for $x_1$ and the second for $x_2$:

$$x_1 = \frac{8 + 2x_2}{10} = 0.8 + 0.2x_2$$

$$x_2 = \frac{9 + 3x_1}{12} = 0.75 + 0.25x_1$$


First iteration: Using initial guesses of $x_1 = x_2 = 0$, we can solve for $x_1$:

$$x_1 = 0.8 + 0.2(0) = 0.8$$

Before solving for $x_2$, we first apply relaxation to our result for $x_1$:

$$x_{1,r} = 1.2(0.8) - 0.2(0) = 0.96$$

We use the subscript r to indicate that this is the “relaxed” value. This result is then used
to compute $x_2$:

$$x_2 = 0.75 + 0.25(0.96) = 0.99$$

We then apply relaxation to this result to give

$$x_{2,r} = 1.2(0.99) - 0.2(0) = 1.188$$

At this point, we could compute estimated errors with Eq. (12.2). However, since we
started with assumed values of zero, the errors for both variables will be 100%.


Second iteration: Using the same procedure as for the first iteration, the second iteration
yields

$$x_1 = 0.8 + 0.2(1.188) = 1.0376$$

$$x_{1,r} = 1.2(1.0376) - 0.2(0.96) = 1.05312$$

$$\varepsilon_{a,1} = \left|\frac{1.05312 - 0.96}{1.05312}\right| \times 100\% = 8.84\%$$

$$x_2 = 0.75 + 0.25(1.05312) = 1.01328$$

$$x_{2,r} = 1.2(1.01328) - 0.2(1.188) = 0.978336$$

$$\varepsilon_{a,2} = \left|\frac{0.978336 - 1.188}{0.978336}\right| \times 100\% = 21.43\%$$

Because we now have nonzero values from the first iteration, we can compute approximate
error estimates as each new value is computed. At this point, although the error
estimate for the first unknown has fallen below the 10% stopping criterion, the second has
not. Hence, we must implement another iteration.
Third iteration:

$$x_1 = 0.8 + 0.2(0.978336) = 0.995667$$

$$x_{1,r} = 1.2(0.995667) - 0.2(1.05312) = 0.984177$$

$$\varepsilon_{a,1} = \left|\frac{0.984177 - 1.05312}{0.984177}\right| \times 100\% = 7.01\%$$

$$x_2 = 0.75 + 0.25(0.984177) = 0.996044$$

$$x_{2,r} = 1.2(0.996044) - 0.2(0.978336) = 0.999586$$

$$\varepsilon_{a,2} = \left|\frac{0.999586 - 0.978336}{0.999586}\right| \times 100\% = 2.13\%$$

At this point, we can terminate the computation because both error estimates have
fallen below the 10% stopping criterion. The results at this juncture, $x_1 = 0.984177$ and
$x_2 = 0.999586$, are converging on the exact solution of $x_1 = x_2 = 1$.
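The relaxation steps of this example can be reproduced with a short script. This is a sketch for this specific 2 × 2 system only; the variable names lambda, es, and ea are illustrative, and the relaxation of Eq. (12.5) is folded directly into each update.

```matlab
% Gauss-Seidel with overrelaxation for the rearranged system of Example 12.2.
lambda = 1.2;                 % weighting factor of Eq. (12.5)
es = 10;                      % stopping criterion (%)
x = [0; 0];                   % initial guesses
for iter = 1:50
    xold = x;
    x(1) = lambda*(0.8 + 0.2*x(2))   + (1 - lambda)*xold(1);   % relaxed x1
    x(2) = lambda*(0.75 + 0.25*x(1)) + (1 - lambda)*xold(2);   % relaxed x2, uses new x1
    ea = abs((x - xold)./x)*100;                               % Eq. (12.2)
    if max(ea) <= es, break, end
end
fprintf('stopped after %d iterations: x1 = %.6f, x2 = %.6f\n', iter, x);
```

The loop should stop on the third pass with the same relaxed values obtained by hand.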


12.2 NONLINEAR SYSTEMS 
The following is a set of two simultaneous nonlinear equations with two unknowns:

$$x_1^2 + x_1x_2 = 10 \tag{12.6a}$$

$$x_2 + 3x_1x_2^2 = 57 \tag{12.6b}$$

In contrast to linear systems which plot as straight lines (recall Fig. 9.1), these equations
plot as curves on an $x_2$ versus $x_1$ graph. As in Fig. 12.3, the solution is the intersection of
the curves.


FIGURE 12.3
Graphical depiction of the solution of two simultaneous nonlinear equations. [The curves
$x_1^2 + x_1x_2 = 10$ and $x_2 + 3x_1x_2^2 = 57$ are plotted, with the solution marked at their intersection.]
Just as we did when we determined roots for single nonlinear equations, such systems
of equations can be expressed generally as

$$\begin{aligned}
f_1(x_1, x_2, \ldots, x_n) &= 0 \\
f_2(x_1, x_2, \ldots, x_n) &= 0 \\
&\;\;\vdots \\
f_n(x_1, x_2, \ldots, x_n) &= 0
\end{aligned} \tag{12.7}$$

Therefore, the solution is the set of values of the x’s that make the equations equal to zero.


12.2.1 Successive Substitution


A simple approach for solving Eq. (12.7) is to use the same strategy that was employed for
fixed-point iteration and the Gauss-Seidel method. That is, each one of the nonlinear equations
can be solved for one of the unknowns. These equations can then be implemented
iteratively to compute new values which (hopefully) will converge on the solutions. This
approach, which is called successive substitution, is illustrated in the following example.


EXAMPLE 12.3


Successive Substitution for a Nonlinear System


Problem Statement. Use successive substitution to determine the roots of Eq. (12.6).
Note that a correct pair of roots is $x_1 = 2$ and $x_2 = 3$. Initiate the computation with guesses of
$x_1 = 1.5$ and $x_2 = 3.5$.


Solution. Equation (12.6a) can be solved for

$$x_1 = \frac{10 - x_1^2}{x_2} \tag{E12.3.1}$$

and Eq. (12.6b) can be solved for

$$x_2 = 57 - 3x_1x_2^2 \tag{E12.3.2}$$

On the basis of the initial guesses, Eq. (E12.3.1) can be used to determine a new value
of $x_1$:

$$x_1 = \frac{10 - (1.5)^2}{3.5} = 2.21429$$

This result and the initial value of $x_2 = 3.5$ can be substituted into Eq. (E12.3.2) to determine
a new value of $x_2$:

$$x_2 = 57 - 3(2.21429)(3.5)^2 = -24.37516$$

Thus, the approach seems to be diverging. This behavior is even more pronounced on the
second iteration:

$$x_1 = \frac{10 - (2.21429)^2}{-24.37516} = -0.20910$$

$$x_2 = 57 - 3(-0.20910)(-24.37516)^2 = 429.709$$

Obviously, the approach is deteriorating.
Now we will repeat the computation but with the original equations set up in a different
format. For example, an alternative solution of Eq. (12.6a) is

$$x_1 = \sqrt{10 - x_1x_2}$$

and of Eq. (12.6b) is

$$x_2 = \sqrt{\frac{57 - x_2}{3x_1}}$$

Now the results are more satisfactory:

$$x_1 = \sqrt{10 - 1.5(3.5)} = 2.17945$$

$$x_2 = \sqrt{\frac{57 - 3.5}{3(2.17945)}} = 2.86051$$

$$x_1 = \sqrt{10 - 2.17945(2.86051)} = 1.94053$$

$$x_2 = \sqrt{\frac{57 - 2.86051}{3(1.94053)}} = 3.04955$$

Thus, the approach is converging on the true values of $x_1 = 2$ and $x_2 = 3$.


The previous example illustrates the most serious shortcoming of successive 
substitution—that is, convergence often depends on the manner in which the equations are 
formulated. Additionally, even in those instances where convergence is possible, divergence 
can occur if the initial guesses are insufficiently close to the true solution. These criteria 
are so restrictive that fixed-point iteration has limited utility for solving nonlinear systems. 
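The sensitivity to formulation is easy to demonstrate numerically. The following sketch runs both rearrangements of Eq. (12.6) from Example 12.3, starting from the same guesses; the iteration counts (2 and 30) and the variable names xdiv and xconv are arbitrary choices made for illustration.

```matlab
% Formulation 1: Eqs. (E12.3.1) and (E12.3.2); the iterates move away from the root.
x1 = 1.5; x2 = 3.5;
for k = 1:2
    x1 = (10 - x1^2)/x2;
    x2 = 57 - 3*x1*x2^2;
end
xdiv = [x1 x2];               % x2 is already in the hundreds after two passes

% Formulation 2: the square-root rearrangement; the iterates approach (2, 3).
x1 = 1.5; x2 = 3.5;
for k = 1:30
    x1 = sqrt(10 - x1*x2);
    x2 = sqrt((57 - x2)/(3*x1));
end
xconv = [x1 x2];

fprintf('formulation 1: %g %g\n', xdiv);
fprintf('formulation 2: %.6f %.6f\n', xconv);
```

Small differences from the hand values of formulation 1 are expected because the script does not round intermediate results.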


12.2.2 Newton-Raphson 


Just as fixed-point iteration can be used to solve systems of nonlinear equations, other 
open root location methods such as the Newton-Raphson method can be used for the same 
purpose. Recall that the Newton-Raphson method was predicated on employing the deriva- 
tive (i.e., the slope) of a function to estimate its intercept with the axis of the independent 
variable—that is, the root. In Chap. 6, we used a graphical derivation to compute this esti- 
mate. An alternative is to derive it from a first-order Taylor series expansion: 


$$f(x_{i+1}) = f(x_i) + (x_{i+1} - x_i)f'(x_i) \tag{12.8}$$

where $x_i$ is the initial guess at the root and $x_{i+1}$ is the point at which the slope intercepts the
x axis. At this intercept, $f(x_{i+1})$ by definition equals zero and Eq. (12.8) can be rearranged
to yield

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \tag{12.9}$$

which is the single-equation form of the Newton-Raphson method.

The multiequation form is derived in an identical fashion. However, a multivariable
Taylor series must be used to account for the fact that more than one independent variable
contributes to the determination of the root. For the two-variable case, a first-order Taylor
series can be written for each nonlinear equation as

$$f_{1,i+1} = f_{1,i} + (x_{1,i+1} - x_{1,i})\frac{\partial f_{1,i}}{\partial x_1} + (x_{2,i+1} - x_{2,i})\frac{\partial f_{1,i}}{\partial x_2} \tag{12.10a}$$

$$f_{2,i+1} = f_{2,i} + (x_{1,i+1} - x_{1,i})\frac{\partial f_{2,i}}{\partial x_1} + (x_{2,i+1} - x_{2,i})\frac{\partial f_{2,i}}{\partial x_2} \tag{12.10b}$$

Just as for the single-equation version, the root estimate corresponds to the values of $x_1$ and $x_2$,
where $f_{1,i+1}$ and $f_{2,i+1}$ equal zero. For this situation, Eq. (12.10) can be rearranged to give

$$\frac{\partial f_{1,i}}{\partial x_1}x_{1,i+1} + \frac{\partial f_{1,i}}{\partial x_2}x_{2,i+1} = -f_{1,i} + x_{1,i}\frac{\partial f_{1,i}}{\partial x_1} + x_{2,i}\frac{\partial f_{1,i}}{\partial x_2} \tag{12.11a}$$

$$\frac{\partial f_{2,i}}{\partial x_1}x_{1,i+1} + \frac{\partial f_{2,i}}{\partial x_2}x_{2,i+1} = -f_{2,i} + x_{1,i}\frac{\partial f_{2,i}}{\partial x_1} + x_{2,i}\frac{\partial f_{2,i}}{\partial x_2} \tag{12.11b}$$

Because all values subscripted with i’s are known (they correspond to the latest guess or
approximation), the only unknowns are $x_{1,i+1}$ and $x_{2,i+1}$. Thus, Eq. (12.11) is a set of two linear
equations with two unknowns. Consequently, algebraic manipulations (e.g., Cramer’s
rule) can be employed to solve for

$$x_{1,i+1} = x_{1,i} - \frac{f_{1,i}\dfrac{\partial f_{2,i}}{\partial x_2} - f_{2,i}\dfrac{\partial f_{1,i}}{\partial x_2}}{\dfrac{\partial f_{1,i}}{\partial x_1}\dfrac{\partial f_{2,i}}{\partial x_2} - \dfrac{\partial f_{1,i}}{\partial x_2}\dfrac{\partial f_{2,i}}{\partial x_1}} \tag{12.12a}$$

$$x_{2,i+1} = x_{2,i} - \frac{f_{2,i}\dfrac{\partial f_{1,i}}{\partial x_1} - f_{1,i}\dfrac{\partial f_{2,i}}{\partial x_1}}{\dfrac{\partial f_{1,i}}{\partial x_1}\dfrac{\partial f_{2,i}}{\partial x_2} - \dfrac{\partial f_{1,i}}{\partial x_2}\dfrac{\partial f_{2,i}}{\partial x_1}} \tag{12.12b}$$

The denominator of each of these equations is formally referred to as the determinant of
the Jacobian of the system.

Equation (12.12) is the two-equation version of the Newton-Raphson method. As
in the following example, it can be employed iteratively to home in on the roots of two
simultaneous equations.


EXAMPLE 12.4


Newton-Raphson for a Nonlinear System


Problem Statement. Use the multiple-equation Newton-Raphson method to determine
roots of Eq. (12.6). Initiate the computation with guesses of $x_1 = 1.5$ and $x_2 = 3.5$.


Solution. First compute the partial derivatives and evaluate them at the initial guesses of
$x_1$ and $x_2$:

$$\frac{\partial f_{1,0}}{\partial x_1} = 2x_1 + x_2 = 2(1.5) + 3.5 = 6.5 \qquad \frac{\partial f_{1,0}}{\partial x_2} = x_1 = 1.5$$

$$\frac{\partial f_{2,0}}{\partial x_1} = 3x_2^2 = 3(3.5)^2 = 36.75 \qquad \frac{\partial f_{2,0}}{\partial x_2} = 1 + 6x_1x_2 = 1 + 6(1.5)(3.5) = 32.5$$

Thus, the determinant of the Jacobian for the first iteration is

$$6.5(32.5) - 1.5(36.75) = 156.125$$

The values of the functions can be evaluated at the initial guesses as

$$f_{1,0} = (1.5)^2 + 1.5(3.5) - 10 = -2.5$$

$$f_{2,0} = 3.5 + 3(1.5)(3.5)^2 - 57 = 1.625$$

These values can be substituted into Eq. (12.12) to give

$$x_1 = 1.5 - \frac{-2.5(32.5) - 1.625(1.5)}{156.125} = 2.03603$$

$$x_2 = 3.5 - \frac{1.625(6.5) - (-2.5)(36.75)}{156.125} = 2.84388$$

Thus, the results are converging to the true values of $x_1 = 2$ and $x_2 = 3$. The computation
can be repeated until an acceptable accuracy is obtained.


When the multiequation Newton-Raphson works, it exhibits the same speedy quadratic 
convergence as the single-equation version. However, just as with successive substitution, 
it can diverge if the initial guesses are not sufficiently close to the true roots. Whereas 
graphical methods could be employed to derive good guesses for the single-equation 
case, no such simple procedure is available for the multiequation version. Although there 
are some advanced approaches for obtaining acceptable first estimates, often the initial 
guesses must be obtained on the basis of trial and error and knowledge of the physical 
system being modeled. 

The two-equation Newton-Raphson approach can be generalized to solve n simultane- 
ous equations. To do this, Eq. (12.11) can be written for the kth equation as 


$$\frac{\partial f_{k,i}}{\partial x_1}x_{1,i+1} + \frac{\partial f_{k,i}}{\partial x_2}x_{2,i+1} + \cdots + \frac{\partial f_{k,i}}{\partial x_n}x_{n,i+1} = -f_{k,i} + x_{1,i}\frac{\partial f_{k,i}}{\partial x_1} + x_{2,i}\frac{\partial f_{k,i}}{\partial x_2} + \cdots + x_{n,i}\frac{\partial f_{k,i}}{\partial x_n} \tag{12.13}$$

where the first subscript k represents the equation or unknown and the second subscript
denotes whether the value or function in question is at the present value (i) or at the
next value (i + 1). Notice that the only unknowns in Eq. (12.13) are the $x_{k,i+1}$ terms on
the left-hand side. All other quantities are located at the present value (i) and, thus, are
known at any iteration. Consequently, the set of equations generally represented by
Eq. (12.13) (i.e., with k = 1, 2, ..., n) constitutes a set of linear simultaneous equations
that can be solved numerically by the elimination methods elaborated in previous
chapters.


Matrix notation can be employed to express Eq. (12.13) concisely as 


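$$[J]\{x_{i+1}\} = -\{f\} + [J]\{x_i\} \tag{12.14}$$

where the partial derivatives evaluated at i are written as the Jacobian matrix consisting of
the partial derivatives:

$$[J] = \begin{bmatrix}
\dfrac{\partial f_{1,i}}{\partial x_1} & \dfrac{\partial f_{1,i}}{\partial x_2} & \cdots & \dfrac{\partial f_{1,i}}{\partial x_n} \\
\dfrac{\partial f_{2,i}}{\partial x_1} & \dfrac{\partial f_{2,i}}{\partial x_2} & \cdots & \dfrac{\partial f_{2,i}}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_{n,i}}{\partial x_1} & \dfrac{\partial f_{n,i}}{\partial x_2} & \cdots & \dfrac{\partial f_{n,i}}{\partial x_n}
\end{bmatrix} \tag{12.15}$$

The initial and final values are expressed in vector form as

$$\{x_i\}^T = \lfloor x_{1,i} \quad x_{2,i} \quad \cdots \quad x_{n,i} \rfloor$$

and

$$\{x_{i+1}\}^T = \lfloor x_{1,i+1} \quad x_{2,i+1} \quad \cdots \quad x_{n,i+1} \rfloor$$

Finally, the function values at i can be expressed as

$$\{f\}^T = \lfloor f_{1,i} \quad f_{2,i} \quad \cdots \quad f_{n,i} \rfloor$$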


Equation (12.14) can be solved using a technique such as Gauss elimination. This 
process can be repeated iteratively to obtain refined estimates in a fashion similar to the 
two-equation case in Example 12.4. 

Insight into the solution can be obtained by solving Eq. (12.14) with matrix inversion. 
Recall that the single-equation version of the Newton-Raphson method is 


$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \tag{12.16}$$

If Eq. (12.14) is solved by multiplying it by the inverse of the Jacobian, the result is

$$\{x_{i+1}\} = \{x_i\} - [J]^{-1}\{f\} \tag{12.17}$$


Comparison of Eqs. (12.16) and (12.17) clearly illustrates the parallels between the 
two equations. In essence, the Jacobian is analogous to the derivative of a multivariate 
function. 

Such matrix calculations can be implemented very efficiently in MATLAB. We can
illustrate this by using MATLAB to duplicate the calculations from Example 12.4. After
defining the initial guesses, we can compute the Jacobian and the function values as


>> x=[1.5;3.5];
>> J=[2*x(1)+x(2) x(1);3*x(2)^2 1+6*x(1)*x(2)]

J =
    6.5000    1.5000
   36.7500   32.5000

>> f=[x(1)^2+x(1)*x(2)-10;x(2)+3*x(1)*x(2)^2-57]

f =
   -2.5000
    1.6250


Then, we can implement Eq. (12.17) to yield the improved estimates


>> x=x-J\f

x =
    2.0360
    2.8439


Although we could continue the iterations in the command mode, a nicer alternative is 
to express the algorithm as an M-file. As in Fig. 12.4, this routine is passed an M-file that 
computes the function values and the Jacobian at a given value of x. It then calls this func- 
tion and implements Eq. (12.17) in an iterative fashion. The routine iterates until an upper 
limit of iterations (maxit) or a specified percent relative error (es) is reached. 


FIGURE 12.4 
MATLAB M-file to implement Newton-Raphson method for nonlinear systems of equations. 


function [x,f,ea,iter]=newtmult(func,x0,es,maxit,varargin)
% newtmult: Newton-Raphson root zeroes nonlinear systems
%   [x,f,ea,iter]=newtmult(func,x0,es,maxit,p1,p2,...):
%     uses the Newton-Raphson method to find the roots of
%     a system of nonlinear equations
% input:
%   func = name of function that returns f and J
%   x0 = initial guess
%   es = desired percent relative error (default = 0.0001%)
%   maxit = maximum allowable iterations (default = 50)
%   p1,p2,... = additional parameters used by function
% output:
%   x = vector of roots
%   f = vector of functions evaluated at roots
%   ea = approximate percent relative error (%)
%   iter = number of iterations

if nargin<2,error('at least 2 input arguments required'),end
if nargin<3||isempty(es),es=0.0001;end
if nargin<4||isempty(maxit),maxit=50;end
iter = 0;
x=x0;
while (1)
  [J,f]=func(x,varargin{:});
  dx=J\f;
  x=x-dx;
  iter = iter + 1;
  ea=100*max(abs(dx./x));
  if iter>=maxit||ea<=es, break, end
end
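The same iteration can also be written as a self-contained script for a specific problem. The sketch below applies Eq. (12.17) to the system of Eq. (12.6) with the analytical Jacobian from Example 12.4; it is an illustration rather than a replacement for the general M-file, and the cap of 20 iterations and the tolerance value are arbitrary choices.

```matlab
% Newton-Raphson for Eq. (12.6) using the analytical Jacobian.
x = [1.5; 3.5];                               % initial guesses
es = 1e-4;                                    % percent relative error tolerance
for iter = 1:20
    J = [2*x(1)+x(2),  x(1);
         3*x(2)^2,     1+6*x(1)*x(2)];        % Jacobian at the current x
    f = [x(1)^2 + x(1)*x(2) - 10;
         x(2) + 3*x(1)*x(2)^2 - 57];          % function values at the current x
    dx = J\f;                                 % solve [J]{dx} = {f}; no explicit inverse
    x = x - dx;                               % Eq. (12.17)
    ea = 100*max(abs(dx./x));
    if ea <= es, break, end
end
fprintf('x1 = %.6f, x2 = %.6f after %d iterations\n', x, iter);
```

Using the backslash operator instead of forming the inverse follows the same choice made in the M-file of Fig. 12.4.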
We should note that there are two shortcomings to the foregoing approach. First,
Eq. (12.15) is sometimes inconvenient to evaluate. Therefore, variations of the Newton-Raphson
approach have been developed to circumvent this dilemma. As might be expected,
most are based on using finite-difference approximations for the partial derivatives that
comprise [J]. The second shortcoming of the multiequation Newton-Raphson method is
that excellent initial guesses are usually required to ensure convergence. Because these are
sometimes difficult or inconvenient to obtain, alternative approaches that are slower than
Newton-Raphson but which have better convergence behavior have been developed. One
approach is to reformulate the nonlinear system as a single function:

$$F(x) = \sum_{i=1}^{n} \left[f_i(x_1, x_2, \ldots, x_n)\right]^2$$

where $f_i(x_1, x_2, \ldots, x_n)$ is the ith member of the original system of Eq. (12.7). The values of
x that minimize this function also represent the solution of the nonlinear system. Therefore,
nonlinear optimization techniques can be employed to obtain solutions.
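As a rough illustration of this idea, the sum-of-squares function for Eq. (12.6) can be handed to a general-purpose minimizer. The sketch below uses MATLAB's fminsearch, which is only one possible choice of optimizer and is not prescribed by the text; the handle name F and the starting point are assumptions made for the example.

```matlab
% Reformulate Eq. (12.6) as a single sum-of-squares function and minimize it.
F = @(x) (x(1)^2 + x(1)*x(2) - 10)^2 + (x(2) + 3*x(1)*x(2)^2 - 57)^2;
x0 = [1.5; 3.5];                      % same initial guesses as in Example 12.4
[xmin, Fmin] = fminsearch(F, x0);     % derivative-free minimization
fprintf('xmin = [%.4f %.4f], F(xmin) = %.3e\n', xmin, Fmin);
```

A minimum with F close to zero corresponds to a root of the original system. A minimum with F well above zero would indicate a local minimum that is not a solution, which is one reason this approach tends to be slower and needs to be checked.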


12.2.3 MATLAB Function: fsolve 


The fsolve function solves systems of nonlinear equations with several variables. A general 
representation of its syntax is 


[x, fx] = fsolve( function, x0, options) 


where [x, fx] = a vector containing the roots x and a vector containing the values of the 
functions evaluated at the roots, function = the name of the function containing a vec- 
tor holding the equations being solved, x0 is a vector holding the initial guesses for the 
unknowns, and options is a data structure created by the optimset function. Note that if 
you desire to pass function parameters but not use the options, pass an empty vector [] 
in its place. 

The optimset function has the syntax 


options = optimset('par1',val1,'par2',val2,...)


where the parameter par_i has the value val_i. A complete listing of all the possible parameters
can be obtained by merely entering optimset at the command prompt. The parameters
commonly used with the fsolve function are


display: When set to 'iter' displays a detailed record of all the iterations. 
tolx: A positive scalar that sets a termination tolerance on x. 


tolfun: A positive scalar that sets a termination tolerance on fx. 
As an example, we can solve the system from Eq. (12.6):

$$f_1(x_1, x_2) = x_1^2 + x_1x_2 - 10$$

$$f_2(x_1, x_2) = x_2 + 3x_1x_2^2 - 57$$

First, set up a function to hold the equations


function f = fun(x)
f = [x(1)^2+x(1)*x(2)-10;x(2)+3*x(1)*x(2)^2-57];


A script can then be used to generate the solution,


clc, format compact
[x, fx] = fsolve(@fun, [1.5;3.5])


with the result


x =
    2.0000
    3.0000
fx =
   1.0e-13 *
         0
    0.1421
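The options structure described above can be passed as the third argument. The following sketch assumes the Optimization Toolbox is installed (fsolve is part of that toolbox) and uses an anonymous function in place of the separate function file; the tolerance values are arbitrary illustrations.

```matlab
% fsolve with options created by optimset (requires the Optimization Toolbox).
fun = @(x) [x(1)^2 + x(1)*x(2) - 10;
            x(2) + 3*x(1)*x(2)^2 - 57];
options = optimset('Display','iter', ...   % print a record of each iteration
                   'TolX',1e-10, ...       % termination tolerance on x
                   'TolFun',1e-10);        % termination tolerance on fx
[x, fx] = fsolve(fun, [1.5; 3.5], options);
```

Tightening TolX and TolFun generally costs a few more iterations in exchange for smaller residuals in fx.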


12.3 CASE STUDY: CHEMICAL REACTIONS


Background. Nonlinear systems of equations occur frequently in the characterization of
chemical reactions. For example, the following chemical reactions take place in a closed
system:

$$2A + B \rightleftharpoons C \tag{12.18}$$

$$A + D \rightleftharpoons C \tag{12.19}$$

At equilibrium, they can be characterized by

$$K_1 = \frac{c_c}{c_a^2 c_b} \tag{12.20}$$

$$K_2 = \frac{c_c}{c_a c_d} \tag{12.21}$$

where the nomenclature $c_i$ represents the concentration of constituent i. If $x_1$ and $x_2$ are the
number of moles of C that are produced due to the first and second reactions, respectively,
formulate the equilibrium relationships as a pair of two simultaneous nonlinear equations.
If $K_1 = 4 \times 10^{-4}$, $K_2 = 3.7 \times 10^{-2}$, $c_{a,0} = 50$, $c_{b,0} = 20$, $c_{c,0} = 5$, and $c_{d,0} = 10$, employ the
Newton-Raphson method to solve these equations.


Solution. Using the stoichiometry of Eqs. (12.18) and (12.19), the concentrations of each
constituent can be represented in terms of $x_1$ and $x_2$ as

$$c_a = c_{a,0} - 2x_1 - x_2 \tag{12.22}$$

$$c_b = c_{b,0} - x_1 \tag{12.23}$$

$$c_c = c_{c,0} + x_1 + x_2 \tag{12.24}$$

$$c_d = c_{d,0} - x_2 \tag{12.25}$$

where the subscript 0 designates the initial concentration of each constituent. These values
can be substituted into Eqs. (12.20) and (12.21) to give

$$K_1 = \frac{c_{c,0} + x_1 + x_2}{(c_{a,0} - 2x_1 - x_2)^2 (c_{b,0} - x_1)}$$

$$K_2 = \frac{c_{c,0} + x_1 + x_2}{(c_{a,0} - 2x_1 - x_2)(c_{d,0} - x_2)}$$

Given the parameter values, these are two nonlinear equations with two unknowns. Thus,
the solution to this problem involves determining the roots of

$$f_1(x_1, x_2) = \frac{5 + x_1 + x_2}{(50 - 2x_1 - x_2)^2 (20 - x_1)} - 4 \times 10^{-4} \tag{12.26}$$

$$f_2(x_1, x_2) = \frac{5 + x_1 + x_2}{(50 - 2x_1 - x_2)(10 - x_2)} - 3.7 \times 10^{-2} \tag{12.27}$$


In order to use Newton-Raphson, we must determine the Jacobian by taking the partial 
derivatives of Eqs. (12.26) and (12.27). Although this is certainly possible, evaluating the 
derivatives is time consuming. An alternative is to represent them by finite differences 
in a fashion similar to the approach used for the modified secant method in Sec. 6.3. For 
example, the partial derivatives comprising the Jacobian can be evaluated as 


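$$\frac{\partial f_1}{\partial x_1} \cong \frac{f_1(x_1 + \delta x_1, x_2) - f_1(x_1, x_2)}{\delta x_1} \qquad \frac{\partial f_1}{\partial x_2} \cong \frac{f_1(x_1, x_2 + \delta x_2) - f_1(x_1, x_2)}{\delta x_2}$$

$$\frac{\partial f_2}{\partial x_1} \cong \frac{f_2(x_1 + \delta x_1, x_2) - f_2(x_1, x_2)}{\delta x_1} \qquad \frac{\partial f_2}{\partial x_2} \cong \frac{f_2(x_1, x_2 + \delta x_2) - f_2(x_1, x_2)}{\delta x_2}$$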


These relationships can then be expressed as an M-file to compute both the function 
values and the Jacobian as 


function [J,f]=jfreact(x,varargin)
del=0.000001;
df1dx1=(u(x(1)+del*x(1),x(2))-u(x(1),x(2)))/(del*x(1));
df1dx2=(u(x(1),x(2)+del*x(2))-u(x(1),x(2)))/(del*x(2));
df2dx1=(v(x(1)+del*x(1),x(2))-v(x(1),x(2)))/(del*x(1));
df2dx2=(v(x(1),x(2)+del*x(2))-v(x(1),x(2)))/(del*x(2));
J=[df1dx1 df1dx2;df2dx1 df2dx2];
f1=u(x(1),x(2));
f2=v(x(1),x(2));
f=[f1;f2];


function f=u(x,y)
f = (5 + x + y) / (50 - 2 * x - y) ^ 2 / (20 - x) - 0.0004;


function f=v(x,y)
f = (5 + x + y) / (50 - 2 * x - y) / (10 - y) - 0.037;
The function newtmult (Fig. 12.4) can then be employed to determine the roots given initial
guesses of $x_1 = x_2 = 3$:


>> format short e, x0 = [3; 3];
>> [x,f,ea,iter] = newtmult(@jfreact,x0)

x =
  3.3366e+000
  2.6772e+000

f =
 -7.1286e-017
  8.5973e-014

ea =
  5.2237e-010

iter =
     4


After four iterations, a solution of $x_1 = 3.3366$ and $x_2 = 2.6772$ is obtained. These values
can then be substituted into Eqs. (12.22) through (12.25) to compute the equilibrium concentrations
of the four constituents:

$$c_a = 50 - 2(3.3366) - 2.6772 = 40.6496$$

$$c_b = 20 - 3.3366 = 16.6634$$

$$c_c = 5 + 3.3366 + 2.6772 = 11.0138$$

$$c_d = 10 - 2.6772 = 7.3228$$
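A quick way to confirm these results is to substitute the equilibrium concentrations back into Eqs. (12.20) and (12.21) and compare with the given constants. The variable names in this sketch (c_a, c_b, c_c, c_d, K1, K2) are chosen here for illustration.

```matlab
% Back-substitute the computed roots into the equilibrium relationships.
x = [3.3366; 2.6772];          % roots reported above
c_a = 50 - 2*x(1) - x(2);      % Eq. (12.22)
c_b = 20 - x(1);               % Eq. (12.23)
c_c = 5 + x(1) + x(2);         % Eq. (12.24)
c_d = 10 - x(2);               % Eq. (12.25)
K1 = c_c/(c_a^2*c_b);          % Eq. (12.20); should be close to 4e-4
K2 = c_c/(c_a*c_d);            % Eq. (12.21); should be close to 3.7e-2
fprintf('K1 = %.4e, K2 = %.4e\n', K1, K2);
```

Agreement is limited by the five significant figures carried in the roots.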


Finally, the fsolve function can also be used to obtain the solution by first writing a 
MATLAB file function to hold the system of nonlinear equations as a vector 


function F=myfun(x) 
F=[(5+x(1)+x(2))/(50-2*x(1)-x(2))^2/(20-x(1))-0.0004;... 
(5+x(1)+x(2))/(50-2*x(1)-x(2))/(10-x(2))-0.037]; 
The solution can then be generated by 
[x,fx] = fsolve(@myfun, [3;3]) 


with the result 


x =
    3.3372
    2.6834
fx = 
1.0e-04 * 
0.0041 
12.1 Solve the following system using three iterations with
Gauss-Seidel using overrelaxation ($\lambda = 1.25$). If necessary,
rearrange the equations and show all the steps in your solution
including your error estimates. At the end of the computation,
compute the true error of your final results.

$$\begin{aligned}
3x_1 + 8x_2 &= 11 \\
7x_1 - x_2 &= 5
\end{aligned}$$


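12.2 (a) Use the Gauss-Seidel method to solve the following
system until the percent relative error falls below $\varepsilon_s = 5\%$:

$$\begin{bmatrix} 0.8 & -0.4 & 0 \\ -0.4 & 0.8 & -0.4 \\ 0 & -0.4 & 0.8 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} 41 \\ 25 \\ 105 \end{Bmatrix}$$

(b) Repeat (a) but use overrelaxation with $\lambda = 1.2$.
12.3 Use the Gauss-Seidel method to solve the following
system until the percent relative error falls below $\varepsilon_s = 5\%$:

$$\begin{aligned}
10x_1 + 2x_2 - x_3 &= 27 \\
-3x_1 - 6x_2 + 2x_3 &= -61.5 \\
x_1 + x_2 + 5x_3 &= -21.5
\end{aligned}$$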


12.4 Repeat Prob. 12.3 but use Jacobi iteration. 

12.5 The following system of equations is designed to determine
concentrations (the c’s in $\mathrm{g/m^3}$) in a series of coupled
reactors as a function of the amount of mass input to each
reactor (the right-hand sides in g/day):

$$\begin{aligned}
15c_1 - 3c_2 - c_3 &= 3800 \\
-3c_1 + 18c_2 - 6c_3 &= 1200 \\
-4c_1 - c_2 + 12c_3 &= 2350
\end{aligned}$$

Solve this problem with the Gauss-Seidel method to $\varepsilon_s = 5\%$.

12.6 Use the Gauss-Seidel method (a) without relaxation
and (b) with relaxation ($\lambda = 1.2$) to solve the following
system to a tolerance of $\varepsilon_s = 5\%$. If necessary, rearrange the
equations to achieve convergence.

$$\begin{aligned}
2x_1 - 6x_2 - x_3 &= -38 \\
-3x_1 - x_2 + 7x_3 &= -34 \\
-8x_1 + x_2 - 2x_3 &= -20
\end{aligned}$$


12.7 Of the following three sets of linear equations, identify 
the set(s) that you could not solve using an iterative method 
such as Gauss-Seidel. Show using any number of itera- 
tions that is necessary that your solution does not converge. 
Clearly state your convergence criteria (how you know it is 
not converging). 


Set One              Set Two              Set Three

8x + 3y + z = 13     x + y + 6z = 8       -3x + 4y + 5z = 6
-6x + 8z = 2         x + 5y - z = 5       -2x + 2y - 3z = -3
2x + 5y - z = 6      4x + 2y - 2z = 4     2y - z = 1


12.8 Determine the solution of the simultaneous nonlinear 
equations 


$$\begin{aligned}
y &= -x^2 + x + 0.75 \\
y + 5xy &= x^2
\end{aligned}$$

Use the Newton-Raphson method and employ initial guesses
of $x = y = 1.2$.

12.9 Determine the solution of the simultaneous nonlinear 
equations: 


(a) Graphically. 
(b) Successive substitution using initial guesses of $x = y = 1.5$.
(c) Newton-Raphson using initial guesses of x = y = 1.5. 
12.10 Figure P12.10 depicts a chemical exchange process 
consisting of a series of reactors in which a gas flowing 
from left to right is passed over a liquid flowing from right 
to left. The transfer of a chemical from the gas into the 
liquid occurs at a rate that is proportional to the difference 
between the gas and liquid concentrations in each reactor. 
At steady state, a mass balance for the first reactor can be 
written for the gas as 


$$Q_G c_{G0} - Q_G c_{G1} + D(c_{L1} - c_{G1}) = 0$$

and for the liquid as

$$Q_L c_{L2} - Q_L c_{L1} + D(c_{G1} - c_{L1}) = 0$$

where $Q_G$ and $Q_L$ are the gas and liquid flow rates, respectively,
and D = the gas-liquid exchange rate. Similar balances
can be written for the other reactors. Use Gauss-Seidel
without relaxation to solve for the concentrations given
the following values: $Q_G = 2$, $Q_L = 1$, $D = 0.8$, $c_{G0} = 100$,
$c_{L6} = 10$.

FIGURE P12.10


12.11 The steady-state distribution of temperature on a 
heated plate can be modeled by the Laplace equation: 


$$0 = \frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2}$$

If the plate is represented by a series of nodes (Fig. P12.11), 
centered finite differences can be substituted for the second 
derivatives, which result in a system of linear algebraic 
equations. Use the Gauss-Seidel method to solve for the 
temperatures of the nodes in Fig. P12.11. 
FIGURE P12.11 (boundary temperatures shown in the figure: 25°C, 25°C, 75°C)

12.12 Develop your own M-file function for the Gauss-Seidel
method without relaxation based on Fig. 12.2, but
change the first line so that it returns the approximate error
and the number of iterations:


function [x,ea,iter] =... 
GaussSeidel(A,b,es, maxit) 


Test it by duplicating Example 12.1 and then use it to solve 
Prob. 12.2a. 

12.13 Develop your own M-file function for Gauss-Seidel 
with relaxation. Here is the function’s first line: 


function [x,ea,iter] =... 
GaussSeidelR(A,b, lambda,es ,maxit) 


In the event that the user does not enter a value for $\lambda$, set the
default value as $\lambda = 1$. Test it by duplicating Example 12.2
and then use it to solve Prob. 12.2b.

12.14 Develop your own M-file function for the Newton- 
Raphson method for nonlinear systems of equations based 
on Fig. 12.4. Test it by solving Example 12.4 and then use it 
to solve Prob. 12.8. 

12.15 Determine the roots of the following simultaneous 
nonlinear equations using (a) fixed-point iteration, (b) the 
Newton-Raphson method, and (c) the fsolve function: 


$$y = -x^2 + x + 0.75 \qquad y + 5xy = x^2$$


Employ initial guesses of x = y = 1.2 and discuss the results. 
12.16 Determine the roots of the simultaneous nonlinear 
equations 


$$(x - 4)^2 + (y - 4)^2 = 5 \qquad x^2 + y^2 = 16$$


Use a graphical approach to obtain your initial guesses. De- 
termine refined estimates with (a) the two-equation Newton- 
Raphson method and (b) the fsolve function. 
12.17 Repeat Prob. 12.16 except determine the positive root of

$$y = x^2 + 1 \qquad y = 2\cos x$$


12.18 The following chemical reactions take place in a 
closed system 


$$\begin{aligned}
2A + B &\rightleftharpoons C \\
A + D &\rightleftharpoons C
\end{aligned}$$

At equilibrium, they can be characterized by

$$K_1 = \frac{c_c}{c_a^2 c_b} \qquad K_2 = \frac{c_c}{c_a c_d}$$

where the nomenclature $c_i$ represents the concentration
of constituent i. If $x_1$ and $x_2$ are the number of moles of C
that are produced due to the first and second reactions,
respectively, reformulate the equilibrium
relationships in terms of the initial concentrations of the
constituents. Then, solve the pair of simultaneous nonlinear
equations for $x_1$ and $x_2$ if $K_1 = 4 \times 10^{-4}$, $K_2 = 3.7 \times 10^{-2}$,
$c_{a0} = 50$, $c_{b0} = 20$, $c_{c0} = 5$, and $c_{d0} = 10$.

(a) Use a graphical approach to develop your initial guesses. 
Then use these guesses as the starting point to determine 
refined estimates with 

(b) the Newton-Raphson method, and 

(c) the fsolve function. 

12.19 As previously described in Sec. 5.6, the following
system of five nonlinear equations governs the chemistry of
rainwater,

$$K_1 = 10^6\,\frac{[\mathrm{H^+}][\mathrm{HCO_3^-}]}{K_H\,p_{\mathrm{CO_2}}} \qquad
K_2 = \frac{[\mathrm{H^+}][\mathrm{CO_3^{2-}}]}{[\mathrm{HCO_3^-}]} \qquad
K_w = [\mathrm{H^+}][\mathrm{OH^-}]$$

$$c_T = \frac{K_H\,p_{\mathrm{CO_2}}}{10^6} + [\mathrm{HCO_3^-}] + [\mathrm{CO_3^{2-}}]$$

$$0 = [\mathrm{HCO_3^-}] + 2[\mathrm{CO_3^{2-}}] + [\mathrm{OH^-}] - [\mathrm{H^+}]$$

where $K_H$ = Henry’s constant, $K_1$, $K_2$, and $K_w$ = equilibrium
coefficients, $c_T$ = total inorganic carbon, $[\mathrm{HCO_3^-}]$ =
bicarbonate, $[\mathrm{CO_3^{2-}}]$ = carbonate, $[\mathrm{H^+}]$ = hydrogen ion, and
$[\mathrm{OH^-}]$ = hydroxyl ion. Notice how the partial pressure of
$\mathrm{CO_2}$ shows up in the equations indicating the impact of this
greenhouse gas on the acidity of rain. Use these equations
and the fsolve function to compute the pH of rainwater
given that $K_H = 10^{-1.46}$, $K_1 = 10^{-6.3}$, $K_2 = 10^{-10.3}$, and
$K_w = 10^{-14}$. Compare the results in 1958 when the $p_{\mathrm{CO_2}}$ was 315
and in 2015 when it was about 400 ppm. Note that this is a
difficult problem to solve because the concentrations tend
to be very small and vary over many orders of magnitude.
Therefore, it is useful to use the trick based on expressing
the unknowns in a negative log scale, $\mathrm{p}K = -\log_{10}(K)$.
That is, the five unknowns, $[\mathrm{H^+}]$, $[\mathrm{OH^-}]$, $[\mathrm{HCO_3^-}]$,
$[\mathrm{CO_3^{2-}}]$, and $c_T$, can be reexpressed as the unknowns pH, pOH,
$\mathrm{pHCO_3}$, $\mathrm{pCO_3}$, and $\mathrm{p}c_T$ as in

$$[\mathrm{H^+}] = 10^{-\mathrm{pH}} \qquad [\mathrm{OH^-}] = 10^{-\mathrm{pOH}} \qquad
[\mathrm{HCO_3^-}] = 10^{-\mathrm{pHCO_3}}$$

$$[\mathrm{CO_3^{2-}}] = 10^{-\mathrm{pCO_3}} \qquad c_T = 10^{-\mathrm{p}c_T}$$

In addition, it is helpful to use optimset to set a stringent
criterion for the function tolerance as in the following script,
which you can use to generate your solutions:


clc, format compact
xguess = [7;7;3;7;3];
options = optimset('tolfun',1e-12)
[x1,fx1] = fsolve(@funpH,xguess,options,315);
[x2,fx2] = fsolve(@funpH,xguess,options,400);
x1',fx1',x2',fx2'
CHAPTER OBJECTIVES


The primary objective of this chapter is to introduce you to eigenvalues. Specific
objectives and topics covered are


• Understanding the mathematical definition of eigenvalues and eigenvectors.
• Understanding the physical interpretation of eigenvalues and eigenvectors within
  the context of engineering systems that vibrate or oscillate.
• Knowing how to implement the polynomial method.
• Knowing how to implement the power method to evaluate the largest and smallest
  eigenvalues and their respective eigenvectors.
• Knowing how to use and interpret MATLAB’s eig function.


YOU’VE GOT A PROBLEM 


At the beginning of Chap. 8, we used Newton’s second law and force balances to
predict the equilibrium positions of three bungee jumpers connected by cords. Because
we assumed that the cords behaved like ideal springs (i.e., followed Hooke’s law),
the steady-state solution reduced to solving a system of linear algebraic equations [recall
Eq. (8.1) and Example 8.2]. In mechanics, this is referred to as a statics problem.

Now let’s look at a dynamics problem involving the same system. That is, we’ll study 
the jumpers’ motion as a function of time. To do this, their initial conditions (i.e., their 
initial positions and velocities) must be prescribed. For example, we can set the jumpers’ 
initial positions at the equilibrium values computed in Example 8.2. If we then set their ini- 
tial velocities to zero, nothing would happen because the system would be at equilibrium. 

Because we are now interested in examining the system’s dynamics, we must set the
initial conditions to values that induce motion. Although we set the jumpers’ initial
positions to the equilibrium values and the middle jumper’s initial velocity to zero, we set the
upper and bottom jumpers’ initial velocities to some admittedly extreme values. That is, we
imposed a downward velocity of 200 m/s on jumper 1 and an upward velocity of 100 m/s
on jumper 3. (Safety tip: Don’t try this at home!) We then used MATLAB to solve the
differential equations [Eq. (8.1)] to generate the resulting positions and velocities as a function
of time.¹


FIGURE 13.1
The (a) positions and (b) velocities versus time for the system of three interconnected
bungee jumpers from Example 8.2.

As displayed in Fig. 13.1, the outcome is that the jumpers oscillate wildly. Because 
there are no friction forces (e.g., no air drag or spring damping), they lurch up and down
around their equilibrium positions in a persistent manner that at least visually borders on 
the chaotic. Closer inspection of the individual trajectories suggests that there may be some 
pattern to the oscillations. For example, the distances between peaks and troughs might be 
constant. But when viewed as a time series, it is difficult to perceive whether there is any- 
thing systematic and predictable going on. 

In this chapter, we deal with one approach for extracting something fundamental out of 
such seemingly chaotic behavior. This entails determining the eigenvalues, or characteristic 
values, for such systems. As we will see, this involves formulating and solving systems of 
linear algebraic equations in a fashion that differs from what we’ve done to this point. To do 
this, let’s first describe exactly what is meant by eigenvalues from a mathematical standpoint. 


¹ We will show how this is done when we cover ordinary differential equations in Part Six.
13.1 MATHEMATICAL BACKGROUND


Chapters 8 through 12 have dealt with methods for solving sets of linear algebraic equa- 
tions of the general form 


[A]{x} = {b} (13.1) 


Such systems are called nonhomogeneous because of the presence of the vector {b} on 
the right-hand side of the equality. If the equations comprising such a system are linearly 
independent (i.e., have a nonzero determinant), they will have a unique solution. In other 
words, there is one set of x values that will make the equations balance. As we’ve already 
seen in Sec. 9.1.1, for two equations with two unknowns, the solution can be visualized as 
the intersection of two straight lines represented by the equations (recall Fig. 9.1). 

In contrast, a homogeneous linear algebraic system has a right-hand side equal to zero: 


[A]{x} =0 (13.2) 


At face value, this equation suggests that the only possible solution would be the trivial 
case for which all x’s = 0. Graphically this would correspond to two straight lines that 
intersected at zero. 

Although this is certainly true, eigenvalue problems associated with engineering are 
typically of the general form 


$$[[A] - \lambda[I]]\{x\} = 0$$ (13.3)


where the parameter $\lambda$ is the eigenvalue. Thus, rather than setting the x’s to zero, we can
determine the value of $\lambda$ that drives the left-hand side to zero! One way to accomplish
this is based on the fact that, for nontrivial solutions to be possible, the determinant of the
matrix must equal zero:


$$|[A] - \lambda[I]| = 0$$ (13.4)


Expanding the determinant yields a polynomial in $\lambda$, which is called the characteristic
polynomial. The roots of this polynomial are the solutions for the eigenvalues.

In order to better understand these concepts, it is useful to examine the two-equation 
case, 


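$$\begin{aligned}
(a_{11} - \lambda)x_1 + a_{12}x_2 &= 0 \\
a_{21}x_1 + (a_{22} - \lambda)x_2 &= 0
\end{aligned}$$ (13.5)

Expanding the determinant of the coefficient matrix gives

$$\begin{vmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{vmatrix}
= \lambda^2 - (a_{11} + a_{22})\lambda + a_{11}a_{22} - a_{12}a_{21}$$ (13.6)


which is the characteristic polynomial. The quadratic formula can then be used to solve
for the two eigenvalues:


$$\lambda_1, \lambda_2 = \frac{(a_{11} + a_{22}) \pm \sqrt{(a_{11} - a_{22})^2 + 4a_{12}a_{21}}}{2}$$ (13.7)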


These are the values that solve Eq. (13.5). Before proceeding, let’s convince ourselves that 
this approach (which, by the way, is called the polynomial method) is correct. 
EXAMPLE 13.1


The Polynomial Method 


Problem Statement. Use the polynomial method to solve for the eigenvalues of the fol- 
lowing homogeneous system: 


$$\begin{aligned}
(10 - \lambda)x_1 - 5x_2 &= 0 \\
-5x_1 + (10 - \lambda)x_2 &= 0
\end{aligned}$$


Solution. Before determining the correct solution, let’s first investigate the case where we
have an incorrect eigenvalue. For example, if $\lambda = 3$, the equations become


$$\begin{aligned}
7x_1 - 5x_2 &= 0 \\
-5x_1 + 7x_2 &= 0
\end{aligned}$$

Plotting these equations yields two straight lines that intersect at the origin (Fig. 13.2a).
Thus, the only solution is the trivial case where $x_1 = x_2 = 0$.


To determine the correct eigenvalues, we can expand the determinant to give the char- 
acteristic polynomial: 


$$\begin{vmatrix} 10 - \lambda & -5 \\ -5 & 10 - \lambda \end{vmatrix} = \lambda^2 - 20\lambda + 75$$

which can be solved for

$$\lambda_1, \lambda_2 = \frac{20 \pm \sqrt{20^2 - 4(1)75}}{2} = 15,\ 5$$


Therefore, the eigenvalues for this system are 15 and 5.
We can now substitute either of these values back into the system and examine the
result. For $\lambda_1 = 15$, we obtain


—5x, — 5x, =0 
—5x, — 5x, =0 


Thus, a correct eigenvalue makes the two equations identical (Fig. 13.2b). In essence, as
we move toward a correct eigenvalue, the two lines rotate until they lie on top of each
other. Mathematically, this means that there are an infinite number of solutions. But solving
either of the equations yields the interesting result that all the solutions have the property
that $x_1 = -x_2$. Although at first glance this might appear trivial, it’s actually quite
interesting as it tells us that the ratio of the unknowns is a constant. This result can be expressed in
vector form as


$$\{x\} = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix}$$

which is referred to as the eigenvector corresponding to the eigenvalue $\lambda = 15$.


In a similar fashion, substituting the second eigenvalue, $\lambda_2 = 5$, gives


$$\begin{aligned}
5x_1 - 5x_2 &= 0 \\
-5x_1 + 5x_2 &= 0
\end{aligned}$$
Again, the eigenvalue makes the two equations identical (Fig. 13.2b) and we can see that
the solution for this case corresponds to $x_1 = x_2$, and the eigenvector is


$$\{x\} = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}$$


FIGURE 13.2
Plots of a system of two homogeneous linear equations from Example 13.1. (a) An incorrect
eigenvalue ($\lambda = 3$) means that the two equations, which are labeled as Eqs. 1 and 2 in the figure,
plot as separate lines and the only solution is the trivial case ($x_1 = x_2 = 0$). (b) In contrast, for the
cases with correct eigenvalues ($\lambda = 5$ and 15), the equations fall on top of each other.


We should recognize that MATLAB has built-in functions to facilitate the polynomial 
method. For Example 13.1, the poly function can be used to generate the characteristic 
polynomial as in 


>> A = [10 -5;-5 10]; 
>> p = poly(A) 


Then, the roots function can be employed to compute the eigenvalues: 
>> d = roots(p) 


d= 
15 
5 
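
As a cross-check on the poly and roots calls, the characteristic polynomial of a 2 × 2 matrix can also be built directly from its elements. The sketch below does this for the matrix of Example 13.1 and then evaluates the determinant of [A] − λ[I] at the incorrect value λ = 3 and at the correct value λ = 15. The variable names (p, dWrong, dRight) are illustrative only.

```matlab
A = [10 -5; -5 10];
% Characteristic polynomial of a 2 x 2 matrix:
% lambda^2 - (a11 + a22)*lambda + (a11*a22 - a12*a21)
p = [1, -(A(1,1) + A(2,2)), A(1,1)*A(2,2) - A(1,2)*A(2,1)];
lambda = roots(p)               % eigenvalues from the polynomial
% The determinant of [A] - lambda*[I] vanishes only at an eigenvalue
dWrong = det(A - 3*eye(2))      % lambda = 3 is not an eigenvalue
dRight = det(A - 15*eye(2))     % lambda = 15 is an eigenvalue
```

The polynomial coefficients come out as 1, −20, and 75. The determinant is nonzero for λ = 3 and zero (to within round-off) for λ = 15, which is the condition expressed by Eq. (13.4).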


The previous example yields the useful mathematical insight that the solution of n 
homogeneous equations of the form of Eq. (13.3) consists of a set of n eigenvalues and their 
associated eigenvectors. Further, it showed that the eigenvectors provide the ratios of the 
unknowns representing the solution. 
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13.2 


In the next section, we will show how such information has utility in engineering and 
science by turning back to our physical problem setting of oscillating objects. However, 
before doing so, we’d like to make two more mathematical points. 

First, inspection of Fig. 13.2b indicates that the straight lines representing each eigen- 
value solution are at right angles to each other. That is, they are orthogonal. This property 
is true for symmetric matrices with distinct eigenvalues. 

Second, multiplying out Eq. (13.3) and separating terms gives 


[A]{x} = A{x} 


When viewed in this way, we can see that solving for the eigenvalues and eigenvectors 
amounts to translating the information content of a matrix [A] into a scalar à. This might 
not seem significant for the 2 x 2 system we have been examining, but it is pretty remark- 
able when we consider that the size of [A] can potentially be much larger. 
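
Both points can be checked numerically for the matrix of Example 13.1. The following sketch assumes MATLAB's built-in eig function (described in Sec. 13.4) and uses illustrative names for the residuals r1 and r2 and the dot product d12.

```matlab
A = [10 -5; -5 10];
[V, D] = eig(A);                % columns of V are the eigenvectors
lambda = diag(D);               % eigenvalues on the diagonal of D
% Each pair should satisfy [A]{x} = lambda{x} to within round-off
r1 = norm(A*V(:,1) - lambda(1)*V(:,1))
r2 = norm(A*V(:,2) - lambda(2)*V(:,2))
% Symmetric matrix with distinct eigenvalues: orthogonal eigenvectors
d12 = V(:,1)'*V(:,2)
```

All three displayed values should be zero to within round-off, confirming that each eigenvector is simply scaled by its eigenvalue when multiplied by [A] and that the two eigenvectors are at right angles.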


13.2 PHYSICAL BACKGROUND


The mass-spring system in Fig. 13.3a is a simple context to illustrate how eigenvalues occur 
in physical problem settings. It will also help to demonstrate some of the mathematical 
concepts introduced in the previous section. 

To simplify the analysis, assume that each mass has no external or damping forces 
acting on it. In addition, assume that each spring has the same natural length / and the same 
spring constant k. Finally, assume that the displacement of each spring is measured rela- 
tive to its own local coordinate system with an origin at the spring’s equilibrium position 
(Fig. 13.3a). Under these assumptions, Newton’s second law can be employed to develop 
a force balance for each mass: 


$$m_1 \frac{d^2 x_1}{dt^2} = -kx_1 + k(x_2 - x_1)$$ (13.8a)

$$m_2 \frac{d^2 x_2}{dt^2} = -k(x_2 - x_1) - kx_2$$ (13.8b)


FIGURE 13.3
A two mass–three spring system with frictionless rollers vibrating between two fixed walls.
The position of the masses can be referenced to local coordinates with origins at their
respective equilibrium positions (a). As in (b), positioning the masses away from equilibrium
creates forces in the springs that on release lead to oscillations of the masses.


where $x_i$ is the displacement of mass i away from its equilibrium position (Fig. 13.3b).
From vibration theory, it is known that solutions to Eq. (13.8) can take the form

$$x_i = X_i \sin(\omega t)$$ (13.9)

where $X_i$ = the amplitude of the oscillation of mass i (m) and $\omega$ = the angular frequency of
the oscillation (radians/time), which is equal to

$$\omega = \frac{2\pi}{T_p}$$ (13.10)

where $T_p$ = the period (time/cycle). Note that the inverse of the period is called the
ordinary frequency f (cycles/time). If time is measured in seconds, the unit for f is cycles/s,
which is referred to as a hertz (Hz).

Equation (13.9) can be differentiated twice and substituted into Eq. (13.8). After
collection of terms, the result is

$$\begin{aligned}
\left(\frac{2k}{m_1} - \omega^2\right) X_1 - \frac{k}{m_1} X_2 &= 0 \\
-\frac{k}{m_2} X_1 + \left(\frac{2k}{m_2} - \omega^2\right) X_2 &= 0
\end{aligned}$$ (13.11)

Comparison of Eq. (13.11) with Eq. (13.3) indicates that at this point, the solution has 
been reduced to an eigenvalue problem—where, for this case, the eigenvalue is the square 
of the frequency. For a two-degree-of-freedom system such as Fig. 13.3, there will be two 
such values along with their eigenvectors. As shown in the following example, the latter 
establish the unique relationship between the unknowns. 


EXAMPLE 13.2


Physical Interpretation of Eigenvalues and Eigenvectors


Problem Statement. If $m_1 = m_2 = 40$ kg and $k = 200$ N/m, Eq. (13.11) is


$$\begin{aligned}
(10 - \lambda)X_1 - 5X_2 &= 0 \\
-5X_1 + (10 - \lambda)X_2 &= 0
\end{aligned}$$


Mathematically, this is the same system we already solved with the polynomial method
in Example 13.1. Thus, the two eigenvalues are $\omega^2 = 15$ and 5 $\mathrm{s^{-2}}$ and the corresponding
eigenvectors are $X_1 = -X_2$ and $X_1 = X_2$, respectively. Interpret these results as they relate to the
mass-spring system of Fig. 13.3.


Solution. This example provides valuable information regarding the behavior of the
system in Fig. 13.3. First, it tells us that the system has two primary modes of oscillation with
angular frequencies of $\omega = 3.873$ and 2.236 radians $\mathrm{s^{-1}}$, respectively. These values can
also be expressed as periods (1.62 and 2.81 s, respectively) or ordinary frequencies (0.6164
and 0.3559 Hz, respectively).

As stated in Sec. 13.1, a unique set of values cannot be obtained for the unknown
amplitudes X. However, their ratios are specified by the eigenvectors. Thus, if the system
is vibrating in the first mode, the first eigenvector tells us that the amplitude of the second
mass will be equal but of opposite sign to the amplitude of the first. As in Fig. 13.4a, the
masses vibrate apart and then together indefinitely (like two hands clapping every 1.62 s).


FIGURE 13.4
The principal modes of vibration of two equal masses connected by three identical springs
between fixed walls: (a) first mode and (b) second mode.

In the second mode, the eigenvector specifies that the two masses have equal amplitudes 
at all times. Thus, as in Fig. 13.4b, they vibrate back and forth in unison every 2.81 s. We 
should note that the configuration of the amplitudes provides guidance on how to set their 
initial values to attain pure motion in either of the two modes. Any other configuration will 
lead to superposition of the modes. It is this superposition that leads to the apparently cha- 
otic behavior of systems like the bungee jumpers in Fig. 13.1. But as this example should 
make clear, there is an underlying systematic behavior that is embodied by the eigenvalues. 
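
The frequencies and periods quoted in this example can be reproduced in a few lines. This is a minimal sketch, assuming the coefficient matrix implied by Eq. (13.11) and using MATLAB's eig function (Sec. 13.4) rather than the polynomial method. The names omega, Tp, and f are illustrative.

```matlab
m = 40;  k = 200;                   % kg and N/m, from the problem statement
A = [2*k/m -k/m; -k/m 2*k/m];       % coefficient matrix implied by Eq. (13.11)
lambda = sort(eig(A), 'descend');   % eigenvalues = omega^2 (1/s^2)
omega = sqrt(lambda)                % angular frequencies (radians/s)
Tp = 2*pi./omega                    % periods (s), Eq. (13.10)
f = 1./Tp                           % ordinary frequencies (Hz)
```

Sorting in descending order lists the mode with the eigenvalue of 15 first, matching the order in which the modes are discussed above.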


13.3 THE POWER METHOD


The power method is an iterative approach that can be employed to determine the largest
or dominant eigenvalue. With slight modification, it can also be employed to determine the
smallest value. It has the additional benefit that the corresponding eigenvector is obtained
as a by-product of the method. To implement the power method, the system being analyzed
is expressed in the form


$$[A]\{x\} = \lambda\{x\}$$ (13.12)
As illustrated by the following example, Eq. (13.12) forms the basis for an iterative
solution technique that eventually yields the highest eigenvalue and its associated
eigenvector.


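EXAMPLE 13.3


Power Method for Highest Eigenvalue


Problem Statement. Using the same approach as in Sec. 13.2, we can derive the following
homogeneous set of equations for a three mass–four spring system between two fixed
walls:


$$\begin{aligned}
\left(\frac{2k}{m_1} - \omega^2\right) X_1 - \frac{k}{m_1} X_2 &= 0 \\
-\frac{k}{m_2} X_1 + \left(\frac{2k}{m_2} - \omega^2\right) X_2 - \frac{k}{m_2} X_3 &= 0 \\
-\frac{k}{m_3} X_2 + \left(\frac{2k}{m_3} - \omega^2\right) X_3 &= 0
\end{aligned}$$


If all the masses m = 1 kg and all the spring constants k = 20 N/m, the system can be
expressed in the matrix format of Eq. (13.4) as


$$\left| \begin{bmatrix} 40 & -20 & 0 \\ -20 & 40 & -20 \\ 0 & -20 & 40 \end{bmatrix} - \lambda[I] \right| = 0$$


where the eigenvalue $\lambda$ is the square of the angular frequency $\omega^2$. Employ the power
method to determine the highest eigenvalue and its associated eigenvector.


Solution. The system is first written in the form of Eq. (13.12):


$$\begin{aligned}
40X_1 - 20X_2 &= \lambda X_1 \\
-20X_1 + 40X_2 - 20X_3 &= \lambda X_2 \\
-20X_2 + 40X_3 &= \lambda X_3
\end{aligned}$$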


At this point, we can specify initial values of the X’s and use the left-hand side to compute 
an eigenvalue and eigenvector. A good first choice is to assume that all the X’s on the left- 
hand side of the equation are equal to one: 


40(1) — 20(1) = 20 
—20(1) + 40(1) — 20(1) = 0 
— 20(1) + 40(1) = 20 


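Next, the right-hand side is normalized by 20 to make the largest element equal to one:


$$\begin{Bmatrix} 20 \\ 0 \\ 20 \end{Bmatrix} = 20 \begin{Bmatrix} 1 \\ 0 \\ 1 \end{Bmatrix}$$

Thus, the normalization factor is our first estimate of the eigenvalue (20) and the corresponding
eigenvector is $\{1\ \ 0\ \ 1\}^T$. This iteration can be expressed concisely in matrix form as


$$\begin{bmatrix} 40 & -20 & 0 \\ -20 & 40 & -20 \\ 0 & -20 & 40 \end{bmatrix}
\begin{Bmatrix} 1 \\ 1 \\ 1 \end{Bmatrix} =
\begin{Bmatrix} 20 \\ 0 \\ 20 \end{Bmatrix} = 20 \begin{Bmatrix} 1 \\ 0 \\ 1 \end{Bmatrix}$$


The next iteration consists of multiplying the matrix by the eigenvector from the last
iteration, $\{1\ \ 0\ \ 1\}^T$, to give


$$\begin{bmatrix} 40 & -20 & 0 \\ -20 & 40 & -20 \\ 0 & -20 & 40 \end{bmatrix}
\begin{Bmatrix} 1 \\ 0 \\ 1 \end{Bmatrix} =
\begin{Bmatrix} 40 \\ -40 \\ 40 \end{Bmatrix} = 40 \begin{Bmatrix} 1 \\ -1 \\ 1 \end{Bmatrix}$$


Therefore, the eigenvalue estimate for the second iteration is 40, which can be employed
to determine an error estimate:


$$|\varepsilon_a| = \left| \frac{40 - 20}{40} \right| \times 100\% = 50\%$$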


The process can then be repeated. 


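Third iteration:


$$\begin{bmatrix} 40 & -20 & 0 \\ -20 & 40 & -20 \\ 0 & -20 & 40 \end{bmatrix}
\begin{Bmatrix} 1 \\ -1 \\ 1 \end{Bmatrix} =
\begin{Bmatrix} 60 \\ -80 \\ 60 \end{Bmatrix} = -80 \begin{Bmatrix} -0.75 \\ 1 \\ -0.75 \end{Bmatrix}$$


where $|\varepsilon_a| = 150\%$ (which is high because of the sign change).


Fourth iteration:


$$\begin{bmatrix} 40 & -20 & 0 \\ -20 & 40 & -20 \\ 0 & -20 & 40 \end{bmatrix}
\begin{Bmatrix} -0.75 \\ 1 \\ -0.75 \end{Bmatrix} =
\begin{Bmatrix} -50 \\ 70 \\ -50 \end{Bmatrix} = 70 \begin{Bmatrix} -0.71429 \\ 1 \\ -0.71429 \end{Bmatrix}$$


where $|\varepsilon_a| = 214\%$ (another sign change).


Fifth iteration:


$$\begin{bmatrix} 40 & -20 & 0 \\ -20 & 40 & -20 \\ 0 & -20 & 40 \end{bmatrix}
\begin{Bmatrix} -0.71429 \\ 1 \\ -0.71429 \end{Bmatrix} =
\begin{Bmatrix} -48.57143 \\ 68.57143 \\ -48.57143 \end{Bmatrix} = 68.57143 \begin{Bmatrix} -0.70833 \\ 1 \\ -0.70833 \end{Bmatrix}$$


where $|\varepsilon_a| = 2.08\%$.
Thus, the eigenvalue is converging. After several more iterations, it stabilizes on a
value of 68.28427 with a corresponding eigenvector of $\{-0.707107\ \ 1\ \ -0.707107\}^T$.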
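
The hand calculations above can be automated with a short loop. The following is a minimal sketch of the power method for this example, not a general-purpose function. The stopping criterion es (in percent), the iteration limit maxit, and the variable names are assumptions made for illustration.

```matlab
A = [40 -20 0; -20 40 -20; 0 -20 40];
x = ones(3,1);                  % initial guess: all X's equal to one
lambda = 0;                     % previous eigenvalue estimate
es = 1e-6;  maxit = 100;        % assumed stopping criterion (%) and limit
for iter = 1:maxit
  y = A*x;                      % left-hand side of Eq. (13.12)
  [~, imax] = max(abs(y));      % locate the element of largest magnitude
  lambdaNew = y(imax);          % normalization factor = eigenvalue estimate
  x = y/lambdaNew;              % scale so that this element equals one
  ea = abs((lambdaNew - lambda)/lambdaNew)*100;   % approximate error (%)
  lambda = lambdaNew;
  if ea <= es, break, end
end
lambda, x
```

Because the normalization factor keeps its sign, the first five passes through the loop reproduce the estimates 20, 40, −80, 70, and 68.57143 obtained by hand.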


Note that there are some instances where the power method will converge to the second- 
largest eigenvalue instead of to the largest. James, Smith, and Wolford (1985) provide an 
illustration of such a case. Other special cases are discussed in Fadeev and Fadeeva (1963). 

In addition, there are sometimes cases where we are interested in determining the
smallest eigenvalue. This can be done by applying the power method to the matrix inverse
of [A]. For this case, the power method will converge on the largest value of $1/\lambda$; in other
words, the smallest value of $\lambda$. An application to find the smallest eigenvalue will be left
as a problem exercise.
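
To give a sense of how the modification works, here is a hedged sketch applied to the matrix of Example 13.3. Solving [A]{y} = {x} with the backslash operator at each step is our choice; it is equivalent to multiplying by the inverse without forming it explicitly. The names mu and lambdaMin and the tolerance are illustrative.

```matlab
A = [40 -20 0; -20 40 -20; 0 -20 40];
x = ones(3,1);  mu = 0;         % initial guess and previous estimate of 1/lambda
for iter = 1:100
  y = A\x;                      % same result as inv(A)*x
  [~, imax] = max(abs(y));      % element of largest magnitude
  muNew = y(imax);              % estimate of the largest value of 1/lambda
  x = y/muNew;                  % normalize the eigenvector estimate
  ea = abs((muNew - mu)/muNew)*100;
  mu = muNew;
  if ea <= 1e-6, break, end     % assumed stopping criterion (%)
end
lambdaMin = 1/mu                % smallest eigenvalue of [A]
```

For this matrix, the loop settles on a value of about 11.7157, which agrees with the smallest eigenvalue reported by eig in Example 13.4.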

Finally, after finding the largest eigenvalue, it is possible to determine the next highest 
by replacing the original matrix by one that includes only the remaining eigenvalues. The 
process of removing the largest known eigenvalue is called deflation. 
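
The text does not say which deflation scheme it has in mind. One common choice for symmetric matrices is Hotelling's deflation, which subtracts the contribution of the known eigenvalue and its unit-length eigenvector. The sketch below applies that choice, as an assumption, to the matrix of Example 13.3 and uses eig only to confirm what remains.

```matlab
A = [40 -20 0; -20 40 -20; 0 -20 40];
lambda1 = 68.28427;                   % dominant eigenvalue from Example 13.3
v1 = [-0.707107; 1; -0.707107];       % its eigenvector from Example 13.3
v1 = v1/norm(v1);                     % scale to unit length
% Hotelling's deflation (assumed scheme; valid for symmetric [A])
B = A - lambda1*(v1*v1');
e = sort(eig(B))                      % approximately 0, 11.7157, and 40
```

The dominant eigenvalue is replaced by a value near zero, so 40 becomes the largest eigenvalue of the deflated matrix. Note that a power iteration on B started from a vector of ones would miss it, because that guess has no component along the eigenvector for 40; this is an instance of the special cases mentioned above.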

We should mention that although the power method can be used to locate intermediate 
values, better methods are available for cases where we need to determine all the eigen- 
values as described in the next section. Thus, the power method is primarily used when we 
want to locate the largest or the smallest eigenvalue. 


13.4 MATLAB FUNCTION: eig


As might be expected, MATLAB has powerful and robust capabilities for evaluating ei- 
genvalues and eigenvectors. The function eig, which is used for this purpose, can be em- 
ployed to generate a vector of the eigenvalues as in 

>> e = eig(A) 
where e is a vector containing the eigenvalues of a square matrix A. Alternatively, it can be 
invoked as 

>> [V,D] = eig(A) 
where D is a diagonal matrix of the eigenvalues and V is a full matrix whose columns are 
the corresponding eigenvectors. 

It should be noted that MATLAB scales the eigenvectors by dividing them by their
Euclidean norm (i.e., their length). Thus, as shown in the following example, although their magnitude
may be different from values computed with, say, the polynomial method, the ratio of their
elements will be identical.


EXAMPLE 13.4


Eigenvalues and Eigenvectors with MATLAB


Problem Statement. Use MATLAB to determine all the eigenvalues and eigenvectors for 
the system described in Example 13.3. 


Solution. Recall that the matrix to be analyzed is 
40 -20 0 
—20 40 -20 
0 -20 40 
The matrix can be entered as 
>> A = [40 -20 0;-20 40 -20;0 -20 40]; 
If we just desire the eigenvalues, we can enter 
>> e = eig(A) 


e =
11.7157
40.0000
68.2843


Notice that the highest eigenvalue (68.2843) is consistent with the value previously deter- 
mined with the power method in Example 13.3. 
If we want both the eigenvalues and eigenvectors, we can enter
>> [v,d] = eig(A)


v =
0.5000 -0.7071 -0.5000
0.7071 -0.0000 0.7071
0.5000 0.7071 -0.5000

d =
11.7157 0 0
0 40.0000 0
0 0 68.2843


Although the results are scaled differently, the eigenvector corresponding to the
highest eigenvalue, $\{-0.5\ \ 0.7071\ \ -0.5\}^T$, is consistent with the value previously determined
with the power method in Example 13.3: $\{-0.707107\ \ 1\ \ -0.707107\}^T$. This can be
demonstrated by dividing the eigenvector from the power method by its Euclidean norm:


>> vpower = [-0.7071 1 -0.7071]';
>> vMATLAB = vpower/norm(vpower)


vMATLAB =
-0.5000
0.7071
-0.5000


Thus, although the magnitudes of the elements differ, their ratios are identical. 


13.5 CASE STUDY  EIGENVALUES AND EARTHQUAKES


Background. Engineers and scientists use mass-spring models to gain insight into the
dynamics of structures under the influence of disturbances such as earthquakes. Figure 13.5
shows such a model for a three-story building. Each floor mass is represented by $m_i$, and
each floor stiffness is represented by $k_i$ for i = 1 to 3.


FIGURE 13.5
A three-story building modeled as a mass-spring system. From the top floor down:
$m_3 = 8000$ kg, $k_3 = 1800$ kN/m; $m_2 = 10000$ kg, $k_2 = 2400$ kN/m;
$m_1 = 12000$ kg, $k_1 = 3000$ kN/m.


For this case, the analysis is limited to horizontal motion of the structure as it is subjected to 
horizontal base motion due to earthquakes. Using the same approach as developed in Sec. 13.2, 
dynamic force balances can be developed for this system as 


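$$\begin{aligned}
\left(\frac{k_1 + k_2}{m_1} - \omega_n^2\right) X_1 - \frac{k_2}{m_1} X_2 &= 0 \\
-\frac{k_2}{m_2} X_1 + \left(\frac{k_2 + k_3}{m_2} - \omega_n^2\right) X_2 - \frac{k_3}{m_2} X_3 &= 0 \\
-\frac{k_3}{m_3} X_2 + \left(\frac{k_3}{m_3} - \omega_n^2\right) X_3 &= 0
\end{aligned}$$


where $X_i$ represent horizontal floor translations (m), and $\omega_n$ is the natural, or resonant,
frequency (radians/s). The resonant frequency can be expressed in Hertz (cycles/s) by dividing
it by $2\pi$ radians/cycle.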

Use MATLAB to determine the eigenvalues and eigenvectors for this system. Graphi- 
cally represent the modes of vibration for the structure by displaying the amplitudes versus 
height for each of the eigenvectors. Normalize the amplitudes so that the translation of the 
third floor is one. 


Solution. The parameters can be substituted into the force balances to give 


$$\begin{aligned}
(450 - \omega_n^2) X_1 - 200X_2 &= 0 \\
-240X_1 + (420 - \omega_n^2) X_2 - 180X_3 &= 0 \\
-225X_2 + (225 - \omega_n^2) X_3 &= 0
\end{aligned}$$


A MATLAB session can be conducted to evaluate the eigenvalues and eigenvectors as 


>> A=[450 -200 0;-240 420 -180;0 -225 225]; 
>> [v,d]=eig(A) 


v =
-0.5879 -0.6344 0.2913
0.7307 -0.3506 0.5725
-0.3471 0.6890 0.7664

d =
698.5982 0 0
0 339.4779 0
0 0 56.9239


Therefore, the eigenvalues are 698.6, 339.5, and 56.92 and the resonant frequencies in 
Hz are 
>> wn=sqrt(diag(d))'/2/pi 
wn = 
4.2066 2.9324 1.2008 
FIGURE 13.6
The three primary modes of oscillation of the three-story building: Mode 1
($\omega_n = 1.2008$ Hz), Mode 2 ($\omega_n = 2.9324$ Hz), and Mode 3 ($\omega_n = 4.2066$ Hz).


The corresponding eigenvectors are (normalizing so that the amplitude for the third floor is one) 


1.6934 —0.9207 0.3801 
—2.1049 —0.5088 0.7470 
1 1 1 


A graph can be made showing the three modes (Fig. 13.6). Note that we have ordered 
them from the lowest to the highest natural frequency as is customary in structural 
engineering. 
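
The ordering and normalization described above can be carried out directly in MATLAB. This is a sketch under the assumption that the columns of v are rescaled by their third element; the names idx, fn, and modes are illustrative.

```matlab
A = [450 -200 0; -240 420 -180; 0 -225 225];
[v, d] = eig(A);
[lambda, idx] = sort(diag(d));        % lowest natural frequency first
v = v(:, idx);                        % reorder the eigenvectors to match
fn = sqrt(lambda)'/(2*pi)             % resonant frequencies (Hz)
% Divide each column by its third element so the third floor is one
modes = v ./ repmat(v(3,:), 3, 1)
```

Each column of modes then holds the floor amplitudes for one mode, from the first floor (row 1) to the third floor (row 3), ready to be plotted against height.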

Natural frequencies and mode shapes are characteristics of structures in terms of
their tendencies to resonate at these frequencies. The frequency content of an earthquake
typically has the most energy between 0 and 20 Hz and is influenced by the earthquake
magnitude, the epicentral distance, and other factors. Rather than a single frequency,
earthquakes contain a spectrum of frequencies with varying amplitudes. Buildings are more receptive
to vibration at their lower modes because these modes have simpler deformed shapes and
require less strain energy to deform. When the frequencies with large amplitudes coincide
with the natural frequencies of buildings, large dynamic responses are induced, creating
large stresses and strains in the structure’s beams, columns, and foundations. Based on
analyses like the one in this case study, structural engineers can more wisely design buildings to
withstand earthquakes with a good factor of safety.
PROBLEMS


13.1 Repeat Example 13.1 but for three masses with the
m’s = 40 kg and the k’s = 240 N/m. Produce a plot like
Fig. 13.4 to identify the principal modes of vibration.

13.2 Use the power method to determine the highest eigen- 
value and corresponding eigenvector for 


$$\begin{bmatrix} 2-\lambda & 8 & 10 \\ 8 & 4-\lambda & 5 \\ 10 & 5 & 7-\lambda \end{bmatrix}$$


13.3 Use the power method to determine the lowest eigen- 
value and corresponding eigenvector for the system from 
Prob. 13.2. 

13.4 Derive the set of differential equations for a three-mass, four-spring
system (Fig. P13.4) that describes their time motion. Write the three
differential equations in matrix form


{Acceleration vector} + [k/m matrix]{displacement vector x} = 0


Note each equation has been divided by the mass. Solve
for the eigenvalues and natural frequencies for the following
values of mass and spring constants: $k_1 = k_4 = 15$ N/m,
$k_2 = k_3 = 35$ N/m, and $m_1 = m_2 = m_3 = 1.5$ kg.

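13.5 Consider the mass-spring system in Fig. P13.5. The frequencies
for the mass vibrations can be determined by solving
for the eigenvalues and by applying $M\ddot{x} + kx = 0$, which
yields

$$\begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix}
\begin{Bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \\ \ddot{x}_3 \end{Bmatrix} +
\begin{bmatrix} 2k & -k & -k \\ -k & 2k & -k \\ -k & -k & 2k \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} =
\begin{Bmatrix} 0 \\ 0 \\ 0 \end{Bmatrix}$$

Applying the guess $x = x_0 e^{i\omega t}$ as a solution, we get the following matrix:

$$\begin{bmatrix} 2k - m_1\omega^2 & -k & -k \\ -k & 2k - m_2\omega^2 & -k \\ -k & -k & 2k - m_3\omega^2 \end{bmatrix}
\begin{Bmatrix} x_{01} \\ x_{02} \\ x_{03} \end{Bmatrix} e^{i\omega t} =
\begin{Bmatrix} 0 \\ 0 \\ 0 \end{Bmatrix}$$

Use MATLAB’s eig command to solve for the eigenvalues
of the $k - m\omega^2$ matrix above. Then use these eigenvalues to
solve for the frequencies ($\omega$). Let $m_1 = m_2 = m_3 = 1$ kg, and
$k = 2$ N/m.


FIGURE P13.4


FIGURE P13.5


FIGURE P13.6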

13.6 As displayed in Fig. P13.6, an LC circuit can be modeled
by the following system of differential equations:

$$\begin{aligned}
L_1 \frac{d^2 i_1}{dt^2} + \frac{1}{C_1}(i_1 - i_2) &= 0 \\
L_2 \frac{d^2 i_2}{dt^2} + \frac{1}{C_2}(i_2 - i_3) - \frac{1}{C_1}(i_1 - i_2) &= 0 \\
L_3 \frac{d^2 i_3}{dt^2} + \frac{1}{C_3} i_3 - \frac{1}{C_2}(i_2 - i_3) &= 0
\end{aligned}$$

where L = inductance (H), t = time (s), i = current (A), and
C = capacitance (F). Assuming that a solution is of the form
$i_j = I_j \sin(\omega t)$, determine the eigenvalues and eigenvectors for
this system with L = 1 H and C = 0.25 F. Draw the network,
illustrating how the currents oscillate in their primary modes.

13.7 Repeat Prob. 13.6 but with only two loops. That is,
omit the $i_3$ loop. Draw the network, illustrating how the currents
oscillate in their primary modes.

13.8 Repeat the problem in Sec. 13.5 but leave off the third 
floor. 

13.9 Repeat the problem in Sec. 13.5 but add a fourth floor
with $m_4$ = 6000 kg and $k_4$ = 1200 kN/m.
FIGURE P13.10
(a) A slender rod. (b) A free-body diagram of a rod.


13.10 The curvature of a slender column subject to an axial
load P (Fig. P13.10) can be modeled by

$$\frac{d^2 y}{dx^2} + p^2 y = 0$$

where

$$p^2 = \frac{P}{EI}$$

where E = the modulus of elasticity, and I = the moment of
inertia of the cross section about its neutral axis.

This model can be converted into an eigenvalue problem
by substituting a centered finite-difference approximation
for the second derivative to give

$$\frac{y_{i+1} - 2y_i + y_{i-1}}{\Delta x^2} + p^2 y_i = 0$$

where i = a node located at a position along the rod’s interior,
and $\Delta x$ = the spacing between nodes. This equation can
be expressed as

$$y_{i-1} - (2 - \Delta x^2 p^2) y_i + y_{i+1} = 0$$

Writing this equation for a series of interior nodes along the
axis of the column yields a homogeneous system of equations.
For example, if the column is divided into five segments
(i.e., four interior nodes), the result is

$$\begin{bmatrix}
(2 - \Delta x^2 p^2) & -1 & 0 & 0 \\
-1 & (2 - \Delta x^2 p^2) & -1 & 0 \\
0 & -1 & (2 - \Delta x^2 p^2) & -1 \\
0 & 0 & -1 & (2 - \Delta x^2 p^2)
\end{bmatrix}
\begin{Bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{Bmatrix} = 0$$

An axially loaded wooden column has the following characteristics:
$E = 10 \times 10^9$ Pa, $I = 1.25 \times 10^{-5}$ m$^4$, and L = 3 m.

For the five-segment, four-node representation: 

(a) Implement the polynomial method with MATLAB to 
determine the eigenvalues for this system. 

(b) Use the MATLAB eig function to determine the eigen- 
values and eigenvectors. 

(c) Use the power method to determine the largest eigen- 
value and its corresponding eigenvector. 

13.11 A system of two homogeneous linear ordinary differential
equations with constant coefficients can be written as

$$\begin{aligned}
\frac{dy_1}{dt} &= -5y_1 + 3y_2, & y_1(0) &= 50 \\
\frac{dy_2}{dt} &= 100y_1 - 301y_2, & y_2(0) &= 100
\end{aligned}$$

If you have taken a course in differential equations, you
know that the solutions for such equations have the form

$$y_i = c_i e^{\lambda t}$$

where c and $\lambda$ are constants to be determined. Substituting
this solution and its derivative into the original equations
converts the system into an eigenvalue problem. The
resulting eigenvalues and eigenvectors can then be used to
derive the general solution to the differential equations. For
example, for the two-equation case, the general solution can
be written in terms of vectors as

$$\{y\} = c_1 \{v_1\} e^{\lambda_1 t} + c_2 \{v_2\} e^{\lambda_2 t}$$

where $\{v_i\}$ = the eigenvector corresponding to the ith eigenvalue
($\lambda_i$) and the c’s are unknown coefficients that can be
determined with the initial conditions.

(a) Convert the system into an eigenvalue problem. 

(b) Use MATLAB to solve for the eigenvalues and 
eigenvectors. 

(c) Employ the results of (b) and the initial conditions to 
determine the general solution. 

(d) Develop a MATLAB plot of the solution for t = 0 
to 1. 

13.12 Water flows between the North American Great
Lakes as depicted in Fig. P13.12. Based on mass balances,
the following differential equations can be written for the
concentrations in each of the lakes for a pollutant that decays
with first-order kinetics:

$$\begin{aligned}
\frac{dc_1}{dt} &= -(0.0056 + k)c_1 \\
\frac{dc_2}{dt} &= -(0.01 + k)c_2 \\
\frac{dc_3}{dt} &= 0.01902c_1 + 0.01387c_2 - (0.047 + k)c_3 \\
\frac{dc_4}{dt} &= 0.33597c_3 - (0.376 + k)c_4 \\
\frac{dc_5}{dt} &= 0.11364c_4 - (0.133 + k)c_5
\end{aligned}$$

where k = the first-order decay rate (/yr), which is equal to
0.69315/(half-life). Note that the constants in each of the
equations account for the flow between the lakes. Due to
the testing of nuclear weapons in the atmosphere, the concentrations
of strontium-90 ($^{90}$Sr) in the five lakes in 1963
were approximately $\{c\} = \{17.7\ \ 30.5\ \ 43.9\ \ 136.3\ \ 30.1\}^T$ in
units of Bq/m$^3$. Assuming that no additional $^{90}$Sr entered the
system thereafter, use MATLAB and the approach outlined
in Prob. 13.11 to compute the concentrations in each of the
lakes from 1963 through 2010. Note that $^{90}$Sr has a half-life
of 28.8 years.

FIGURE P13.12
The North American Great Lakes. The arrows indicate how water
flows between the lakes.


13.13 Develop an M-file function to determine the largest 
eigenvalue and its associated eigenvector with the power 
method. Test the program by duplicating Example 13.3 and 
then use it to solve Prob. 13.2. 

13.14 Repeat the computations in Sec. 13.5 but remove the 
third floor. 

13.15 Repeat the computations in Sec. 13.5 but add a fourth
floor with a mass of $m_4$ = 6000 kg connected with the third
floor by a spring with $k_4$ = 1200 kN/m.

13.16 Recall that at the start of Chap. 8, we suspended three 
bungee jumpers with zero drag from frictionless cords that 
followed Hooke’s law. Determine the resulting eigenval- 
ues and eigenvectors that would eventually characterize the 
jumpers’ oscillating motions and relative positions if they 
were instantaneously released from their starting positions 
as depicted in Fig. 8.1a (i.e., with each cord fully extended, 
but not stretched). Although bungee cords do not actually 
behave like true springs, assume that they stretch and com- 
press in linear proportion to the applied force. Use the pa- 
rameters from Example 8.2. 
Curve Fitting


OVERVIEW 


What Is Curve Fitting? 


Data are often given for discrete values along a continuum. However, you may require es- 
timates at points between the discrete values. Chapters 14 through 18 describe techniques 
to fit curves to such data to obtain intermediate estimates. In addition, you may require a 
simplified version of a complicated function. One way to do this is to compute values of the 
function at a number of discrete values along the range of interest. Then, a simpler function 
may be derived to fit these values. Both of these applications are known as curve fitting. 

There are two general approaches for curve fitting that are distinguished from each 
other on the basis of the amount of error associated with the data. First, where the data 
exhibit a significant degree of error or “scatter,” the strategy is to derive a single curve that 
represents the general trend of the data. Because any individual data point may be incor- 
rect, we make no effort to intersect every point. Rather, the curve is designed to follow the 
pattern of the points taken as a group. One approach of this nature is called least-squares
regression (Fig. PT4.1a).

Second, where the data are known to be very precise, the basic approach is to fit a curve or 
a series of curves that pass directly through each of the points. Such data usually originate from 
tables. Examples are values for the density of water 
or for the heat capacity of gases as a function of 
temperature. The estimation of values between 
well-known discrete points is called interpolation 
(Fig. PT4.1b and c). 
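The contrast between the two approaches can be sketched with a few lines of MATLAB. The example below is only an illustration under stated assumptions: the data are the scattered wind tunnel measurements that appear in Table 14.1, the query point xq = 45 is an arbitrary choice, and the variable names are our own. The built-in functions polyfit, polyval, and interp1 are used for the straight-line regression and the linear interpolation.

```matlab
% Scattered measurements: velocity (m/s) and force (N)
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];

xq = 45;                      % point where an intermediate estimate is wanted

% (a) Least-squares regression: one straight line that follows the overall trend
p = polyfit(x, y, 1);         % p(1) = slope, p(2) = intercept
yreg = polyval(p, xq);

% (b) Linear interpolation: straight segments that pass through every point
yint = interp1(x, y, xq);

fprintf('regression estimate:    %.3f\n', yreg);
fprintf('interpolation estimate: %.3f\n', yint);
```

The two estimates differ because the regression line is not required to pass through any individual point, whereas the interpolant honors the two neighboring measurements exactly. Which behavior is preferable depends on how much error the data are believed to contain.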


Curve Fitting and Engineering and Science. 
Your first exposure to curve fitting may have 
been to determine intermediate values from 
tabulated data—for instance, from interest ta- 
bles for engineering economics or from steam 
tables for thermodynamics. Throughout the re- 
mainder of your career, you will have frequent 
occasion to estimate intermediate values from 
such tables. 

Although many of the widely used engineering and scientific properties have been
tabulated, there are a great many more that are not available in this convenient form. Special
cases and new problem contexts often require that you measure your own data and develop
your own predictive relationships. Two types of applications are generally encountered
when fitting experimental data: trend analysis and hypothesis testing.


FIGURE PT4.1
Three attempts to fit a “best” curve through five data points: (a) least-squares regression,
(b) linear interpolation, and (c) curvilinear interpolation.

Trend analysis represents the process of using the pattern of the data to make predic- 
tions. For cases where the data are measured with high precision, you might utilize interpo- 
lating polynomials. Imprecise data are often analyzed with least-squares regression. 

Trend analysis may be used to predict or forecast values of the dependent variable. This 
can involve extrapolation beyond the limits of the observed data or interpolation within the 
range of the data. All fields of engineering and science involve problems of this type. 

A second application of experimental curve fitting is hypothesis testing. Here, an 
existing mathematical model is compared with measured data. If the model coefficients 
are unknown, it may be necessary to determine values that best fit the observed data. 
On the other hand, if estimates of the model coefficients are already available, it may be
appropriate to compare predicted values of the model with observed values to test the
adequacy of the model. Often, alternative models are compared and the “best” one is selected
on the basis of empirical observations.

In addition to the foregoing engineering and scientific applications, curve fitting is
important in other numerical methods such as integration and the approximate solution of
differential equations. Finally, curve-fitting techniques can be used to derive simple
functions to approximate complicated functions.


4.2 PART ORGANIZATION


After a brief review of statistics, Chap. 14 focuses on linear regression; that is, how to deter- 
mine the “best” straight line through a set of uncertain data points. Besides discussing how to 
calculate the slope and intercept of this straight line, we also present quantitative and visual 
methods for evaluating the validity of the results. In addition, we describe random number 
generation as well as several approaches for the linearization of nonlinear equations. 

Chapter 15 begins with brief discussions of polynomial and multiple linear regres- 
sion. Polynomial regression deals with developing a best fit of parabolas, cubics, or 
higher-order polynomials. This is followed by a description of multiple linear regres- 
sion, which is designed for the case where the dependent variable y is a linear function 
of two or more independent variables $x_1, x_2, \ldots, x_m$. This approach has special utility for
evaluating experimental data where the variable of interest is dependent on a number of 
different factors. 

After multiple regression, we illustrate how polynomial and multiple regression are 
both subsets of a general linear least-squares model. Among other things, this will allow us 
to introduce a concise matrix representation of regression and discuss its general statistical 
properties. Finally, the last sections of Chap. 15 are devoted to nonlinear regression. This 
approach is designed to compute a least-squares fit of a nonlinear equation to data. 

Chapter 16 deals with Fourier analysis which involves fitting periodic functions to 
data. Our emphasis will be on the fast Fourier transform or FFT. This method, which is 
readily implemented with MATLAB, has many engineering applications, ranging from 
vibration analysis of structures to signal processing. 

In Chap. 17, the alternative curve-fitting technique called interpolation is described. 
As discussed previously, interpolation is used for estimating intermediate values between 
precise data points. In Chap. 17, polynomials are derived for this purpose. We introduce 
the basic concept of polynomial interpolation by using straight lines and parabolas to con- 
nect points. Then, we develop a generalized procedure for fitting an nth-order polynomial. 
Two formats are presented for expressing these polynomials in equation form. The first, 
called Newton’s interpolating polynomial, is preferable when the appropriate order of the 
polynomial is unknown. The second, called the Lagrange interpolating polynomial, has 
advantages when the proper order is known beforehand. 

Finally, Chap. 18 presents an alternative technique for fitting precise data points. This 
technique, called spline interpolation, fits polynomials to data but in a piecewise fashion. 
As such, it is particularly well suited for fitting data that are generally smooth but exhibit 
abrupt local changes. The chapter ends with an overview of how piecewise interpolation is 
implemented in MATLAB. 
Linear Regression


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to how least-squares 
regression can be used to fit a straight line to measured data. Specific objectives and 
topics covered are 


Familiarizing yourself with some basic descriptive statistics and the normal 
distribution. 

Knowing how to compute the slope and intercept of a best-fit straight line with 
linear regression. 

Knowing how to generate random numbers with MATLAB and how they can be 
employed for Monte Carlo simulations. 

Knowing how to compute and understand the meaning of the coefficient of 
determination and the standard error of the estimate. 

Understanding how to use transformations to linearize nonlinear equations so that 
they can be fit with linear regression. 

Knowing how to implement linear regression with MATLAB. 


YOU’VE GOT A PROBLEM 


In Chap. 1, we noted that a free-falling object such as a bungee jumper is subject to the
upward force of air resistance. As a first approximation, we assumed that this force was
proportional to the square of velocity as in

$$F_U = c_d v^2 \tag{14.1}$$

where $F_U$ = the upward force of air resistance [N = kg m/s$^2$], $c_d$ = a drag coefficient (kg/m),
and $v$ = velocity [m/s].

Expressions such as Eq. (14.1) come from the field of fluid mechanics. Although such
relationships derive in part from theory, experiments play a critical role in their
formulation. One such experiment is depicted in Fig. 14.1. An individual is suspended in a wind
tunnel (any volunteers?) and the force measured for various levels of wind velocity. The
result might be as listed in Table 14.1.


FIGURE 14.1
Wind tunnel experiment to measure how the force of air resistance depends on velocity.


FIGURE 14.2
Plot of force versus wind velocity for an object suspended in a wind tunnel.


TABLE 14.1 Experimental data for force (N) and velocity (m/s) from a wind tunnel experiment.

v, m/s    10    20    30    40    50    60    70    80
F, N      25    70   380   550   610  1220   830  1450

The relationship can be visualized by plotting force versus velocity. As in Fig. 14.2, 
several features of the relationship bear mention. First, the points indicate that the force 
increases as velocity increases. Second, the points do not increase smoothly, but exhibit 
rather significant scatter, particularly at the higher velocities. Finally, although it may not 
be obvious, the relationship between force and velocity may not be linear. This conclusion 
becomes more apparent if we assume that force is zero for zero velocity. 
In Chaps. 14 and 15, we will explore how to fit a “best” line or curve to such data. In
so doing, we will illustrate how relationships like Eq. (14.1) arise from experimental data.


14.1 STATISTICS REVIEW


Before describing least-squares regression, we will first review some basic concepts 
from the field of statistics. These include the mean, standard deviation, residual sum of 
the squares, and the normal distribution. In addition, we describe how simple descriptive 
statistics and distributions can be generated in MATLAB. If you are familiar with these 
subjects, feel free to skip the following pages and proceed directly to Sec. 14.2. If you 
are unfamiliar with these concepts or are in need of a review, the following material is 
designed as a brief introduction. 


14.1.1 Descriptive Statistics 


Suppose that in the course of an engineering study, several measurements were made of a 
particular quantity. For example, Table 14.2 contains 24 readings of the coefficient of ther- 
mal expansion of a structural steel. Taken at face value, the data provide a limited amount 
of information—that is, that the values range from a minimum of 6.395 to a maximum 
of 6.775. Additional insight can be gained by summarizing the data in one or more well- 
chosen statistics that convey as much information as possible about specific characteristics of 
the data set. These descriptive statistics are most often selected to represent (1) the location of 
the center of the distribution of the data and (2) the degree of spread of the data set. 


Measure of Location. The most common measure of central tendency is the arithmetic
mean. The arithmetic mean ($\bar{y}$) of a sample is defined as the sum of the individual data
points ($y_i$) divided by the number of points (n), or

$$\bar{y} = \frac{\sum y_i}{n} \tag{14.2}$$

where the summation (and all the succeeding summations in this section) is from i = 1
through n.

There are several alternatives to the arithmetic mean. The median is the midpoint of a 
group of data. It is calculated by first putting the data in ascending order. If the number of 
measurements is odd, the median is the middle value. If the number is even, it is the arith- 
metic mean of the two middle values. The median is sometimes called the 50th percentile. 

The mode is the value that occurs most frequently. The concept usually has direct utility
only when dealing with discrete or coarsely rounded data. For continuous variables such
as the data in Table 14.2, the concept is not very practical. For example, there are actually
four modes for these data: 6.555, 6.625, 6.655, and 6.715, which all occur twice. If the
numbers had not been rounded to 3 decimal digits, it would be unlikely that any of the values
would even have repeated twice. However, if continuous data are grouped into equispaced
intervals, it can be an informative statistic. We will return to the mode when we describe
histograms later in this section.


TABLE 14.2 Measurements of the coefficient of thermal expansion of structural steel.

6.495    6.595    6.615    6.635    6.485    6.555
6.665    6.505    6.435    6.625    6.715    6.655
6.755    6.625    6.715    6.575    6.655    6.605
6.565    6.515    6.555    6.395    6.775    6.685


Measures of Spread. The simplest measure of spread is the range, the difference be- 
tween the largest and the smallest value. Although it is certainly easy to determine, it is not 
considered a very reliable measure because it is highly sensitive to the sample size and is 
very sensitive to extreme values. 

The most common measure of spread for a sample is the standard deviation ($s_y$) about
the mean:

$$s_y = \sqrt{\frac{S_t}{n - 1}} \tag{14.3}$$

where $S_t$ is the total sum of the squares of the residuals between the data points and the
mean, or

$$S_t = \sum (y_i - \bar{y})^2 \tag{14.4}$$

Thus, if the individual measurements are spread out widely around the mean, $S_t$ (and,
consequently, $s_y$) will be large. If they are grouped tightly, the standard deviation will be
small. The spread can also be represented by the square of the standard deviation, which
is called the variance:

$$s_y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1} \tag{14.5}$$

Note that the denominator in both Eqs. (14.3) and (14.5) is $n - 1$. The quantity $n - 1$ is
referred to as the degrees of freedom. Hence $S_t$ and $s_y$ are said to be based on $n - 1$ degrees
of freedom. This nomenclature derives from the fact that the sum of the quantities upon
which $S_t$ is based (i.e., $\bar{y} - y_1, \bar{y} - y_2, \ldots, \bar{y} - y_n$) is zero. Consequently, if $\bar{y}$ is known and
$n - 1$ of the values are specified, the remaining value is fixed. Thus, only $n - 1$ of the values
are said to be freely determined. Another justification for dividing by $n - 1$ is the fact
that there is no such thing as the spread of a single data point. For the case where n = 1,
both $S_t$ and $n - 1$ are zero, so Eqs. (14.3) and (14.5) yield a meaningless, undefined result (0/0).

We should note that an alternative, more convenient formula is available to compute
the variance:

$$s_y^2 = \frac{\sum y_i^2 - \left(\sum y_i\right)^2 / n}{n - 1} \tag{14.6}$$

This version does not require precomputation of $\bar{y}$ and yields the same result as
Eq. (14.5).

A final statistic that has utility in quantifying the spread of data is the coefficient of
variation (c.v.). This statistic is the ratio of the standard deviation to the mean. As such, it
provides a normalized measure of the spread. It is often multiplied by 100 so that it can be
expressed in the form of a percent:

$$\text{c.v.} = \frac{s_y}{\bar{y}} \times 100\% \tag{14.7}$$
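Equations (14.2) through (14.7) translate almost line for line into MATLAB. The following is a minimal sketch that evaluates them from their definitions for the readings in Table 14.2, deliberately avoiding the built-in statistics functions so that each formula stays visible. The variable names (ybar, ymed, St, sy, sy2, sy2alt, cv) are our own choices.

```matlab
% Coefficient of thermal expansion readings from Table 14.2
y = [6.495 6.595 6.615 6.635 6.485 6.555 ...
     6.665 6.505 6.435 6.625 6.715 6.655 ...
     6.755 6.625 6.715 6.575 6.655 6.605 ...
     6.565 6.515 6.555 6.395 6.775 6.685];
n = numel(y);

ybar = sum(y)/n;                            % arithmetic mean, Eq. (14.2)

ys = sort(y);                               % median for an even number of values:
ymed = (ys(n/2) + ys(n/2 + 1))/2;           % mean of the two middle values

St  = sum((y - ybar).^2);                   % sum of squared residuals, Eq. (14.4)
sy  = sqrt(St/(n - 1));                     % standard deviation, Eq. (14.3)
sy2 = St/(n - 1);                           % variance, Eq. (14.5)
sy2alt = (sum(y.^2) - sum(y)^2/n)/(n - 1);  % variance without ybar, Eq. (14.6)
cv = sy/ybar*100;                           % coefficient of variation (%), Eq. (14.7)

fprintf('mean = %.4f, median = %.4f\n', ybar, ymed);
fprintf('St = %.6f, sy = %.6f, variance = %.6f (%.6f)\n', St, sy, sy2, sy2alt);
fprintf('c.v. = %.2f%%\n', cv);
```

Although Eq. (14.6) is convenient for hand calculation, it subtracts two nearly equal numbers, so for data with a large mean and a small spread the form in Eq. (14.5) is usually the safer choice in floating-point arithmetic.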
EXAMPLE 14.1


Simple Statistics of a Sample


Problem Statement. Compute the mean, median, variance, standard deviation, and
coefficient of variation for the data in Table 14.2.


Solution. The data can be assembled in tabular form and the necessary sums computed as
in Table 14.3.
The mean can be computed as [Eq. (14.2)],

$$\bar{y} = \frac{158.4}{24} = 6.6$$

Because there are an even number of values, the median is computed as the arithmetic
mean of the middle two values: (6.605 + 6.615)/2 = 6.61.

As in Table 14.3, the sum of the squares of the residuals is 0.217000, which can be
used to compute the standard deviation [Eq. (14.3)]:

$$s_y = \sqrt{\frac{0.217000}{24 - 1}} = 0.097133$$

the variance [Eq. (14.5)]:

$$s_y^2 = (0.097133)^2 = 0.009435$$

and the coefficient of variation [Eq. (14.7)]:

$$\text{c.v.} = \frac{0.097133}{6.6} \times 100\% = 1.47\%$$

The validity of Eq. (14.6) can also be verified by computing

$$s_y^2 = \frac{1045.657 - (158.400)^2/24}{24 - 1} = 0.009435$$


TABLE 14.3 Data and summations for computing simple descriptive statistics for the
coefficients of thermal expansion from Table 14.2.

  i      $y_i$     $(y_i - \bar{y})^2$     $y_i^2$

  1      6.395     0.04203     40.896
  2      6.435     0.02723     41.409
  3      6.485     0.01323     42.055
  4      6.495     0.01103     42.185
  5      6.505     0.00903     42.315
  6      6.515     0.00723     42.445
  7      6.555     0.00203     42.968
  8      6.555     0.00203     42.968
  9      6.565     0.00123     43.099
 10      6.575     0.00063     43.231
 11      6.595     0.00003     43.494
 12      6.605     0.00002     43.626
 13      6.615     0.00022     43.758
 14      6.625     0.00062     43.891
 15      6.625     0.00062     43.891
 16      6.635     0.00122     44.023
 17      6.655     0.00302     44.289
 18      6.655     0.00302     44.289
 19      6.665     0.00422     44.422
 20      6.685     0.00722     44.689
 21      6.715     0.01322     45.091
 22      6.715     0.01322     45.091
 23      6.755     0.02402     45.630
 24      6.775     0.03062     45.901
  Σ    158.400     0.21700   1045.657


14.1.2 The Normal Distribution 


Another characteristic that bears on the present discussion is the data distribution—that is, 
the shape with which the data are spread around the mean. A histogram provides a simple 
visual representation of the distribution. A histogram is constructed by sorting the mea- 
surements into intervals, or bins. The units of measurement are plotted on the abscissa and 
the frequency of occurrence of each interval is plotted on the ordinate. 

As an example, a histogram can be created for the data from Table 14.2. The result 
(Fig. 14.3) suggests that most of the data are grouped close to the mean value of 6.6. 
Notice also, that now that we have grouped the data, we can see that the bin with the most 
values is from 6.6 to 6.64. Although we could say that the mode is the midpoint of this 
bin, 6.62, it is more common to report the most frequent range as the modal class interval. 

If we have a very large set of data, the histogram often can be approximated by a 
smooth curve. The symmetric, bell-shaped curve superimposed on Fig. 14.3 is one such 
characteristic shape—the normal distribution. Given enough additional measurements, the 
histogram for this particular case could eventually approach the normal distribution. 


FIGURE 14.3
A histogram used to depict the distribution of data. As the number of data points increases,
the histogram often approaches the smooth, bell-shaped curve called the normal distribution.
(Axes: measured value on the abscissa, frequency on the ordinate.)
The concepts of the mean, standard deviation, residual sum of the squares, and normal
distribution all have great relevance to engineering and science. A very simple example
is their use to quantify the confidence that can be ascribed to a particular measurement.
If a quantity is normally distributed, the range defined by $\bar{y} - s_y$ to $\bar{y} + s_y$ will encompass
approximately 68% of the total measurements. Similarly, the range defined by $\bar{y} - 2s_y$
to $\bar{y} + 2s_y$ will encompass approximately 95%.

For example, for the data in Table 14.2, we calculated in Example 14.1 that $\bar{y} = 6.6$
and $s_y = 0.097133$. Based on our analysis, we can tentatively make the statement that
approximately 95% of the readings should fall between 6.405734 and 6.794266. Because it
is so far outside these bounds, if someone told us that they had measured a value of 7.35, we
would suspect that the measurement might be erroneous.
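As a quick check on these rules of thumb, the fraction of the Table 14.2 readings that actually fall within one and two standard deviations of the mean can be counted directly. The short sketch below does this; the names within1 and within2 are our own, and with only 24 readings the fractions should be expected to match 68% and 95% only roughly.

```matlab
% Coefficient of thermal expansion readings from Table 14.2
y = [6.495 6.595 6.615 6.635 6.485 6.555 ...
     6.665 6.505 6.435 6.625 6.715 6.655 ...
     6.755 6.625 6.715 6.575 6.655 6.605 ...
     6.565 6.515 6.555 6.395 6.775 6.685];

ybar = mean(y);
sy = std(y);

% Fraction of readings within one and two standard deviations of the mean
within1 = sum(abs(y - ybar) <= sy)/numel(y);
within2 = sum(abs(y - ybar) <= 2*sy)/numel(y);

fprintf('within 1 std: %.1f%%\n', 100*within1);
fprintf('within 2 std: %.1f%%\n', 100*within2);
```

For these data, 16 of the 24 readings lie within one standard deviation and 23 of 24 lie within two, which is in reasonable agreement with the percentages expected for a normal distribution.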


14.1.3 Descriptive Statistics in MATLAB 


Standard MATLAB has several functions to compute descriptive statistics.¹ For example,
the arithmetic mean is computed as mean(x). If x is a vector, the function returns the mean of
the vector’s values. If it is a matrix, it returns a row vector containing the arithmetic mean
of each column of x. The following is the result of using mean and the other statistical
functions to analyze a column vector s that holds the data from Table 14.2:


>> format short g
>> mean(s),median(s),mode(s)
ans =
          6.6
ans =
         6.61
ans =
        6.555
>> min(s),max(s)
ans =
        6.395
ans =
        6.775
>> range=max(s)-min(s)
range =
         0.38
>> var(s),std(s)
ans =
    0.0094348
ans =
     0.097133


These results are consistent with those obtained previously in Example 14.1. Note that 
although there are four values that occur twice, the mode function only returns the first of 
the values: 6.555. 


¹ MATLAB also offers a Statistics Toolbox that supports a wide range of common statistical tasks, from
random number generation, to curve fitting, to design of experiments and statistical process control.
MATLAB can also be used to generate a histogram based on the hist function. The
hist function has the syntax


[n, x] = hist(y, x)


where n = the number of elements in each bin, x = a vector specifying the midpoint of each
bin, and y is the vector being analyzed. For the data from Table 14.2, the result is


>> [n,x]=hist(s)
n =
     1     1     3     1     4     3     5     2     2     2
x =
    6.414    6.452     6.49    6.528    6.566    6.604    6.642     6.68    6.718    6.756

The resulting histogram depicted in Fig. 14.4 is similar to the one we generated by hand in
Fig. 14.3. Note that all the arguments and outputs with the exception of y are optional. For
example, hist(y) without output arguments just produces a histogram bar plot with 10 bins
determined automatically based on the range of values in y.


FIGURE 14.4
Histogram generated with the MATLAB hist function.
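To see where these bin counts and midpoints come from, the binning can be reproduced by hand. The sketch below assumes, as the output above suggests, that the default behavior is to split the interval from the smallest to the largest value into 10 equal-width bins; it is an illustration of the idea, not the internal implementation of hist, and the variable names are our own.

```matlab
% Coefficient of thermal expansion readings from Table 14.2
s = [6.495 6.595 6.615 6.635 6.485 6.555 ...
     6.665 6.505 6.435 6.625 6.715 6.655 ...
     6.755 6.625 6.715 6.575 6.655 6.605 ...
     6.565 6.515 6.555 6.395 6.775 6.685];

nbins = 10;
lo = min(s);
hi = max(s);
w = (hi - lo)/nbins;                       % common bin width

centers = lo + w*((1:nbins) - 0.5);        % midpoint of each bin

% Bin index of each reading; the maximum value is assigned to the last bin
idx = min(floor((s - lo)/w) + 1, nbins);

n = accumarray(idx(:), 1, [nbins 1])';     % number of readings in each bin

disp(n)
disp(centers)
```

The min(..., nbins) guard is there because the largest reading sits exactly on the upper edge of the last bin and would otherwise be given an index of nbins + 1.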


14.2 RANDOM NUMBERS AND SIMULATION


In this section, we will describe two MATLAB functions that can be used to produce 
a sequence of random numbers. The first (rand) generates numbers that are uniformly 
distributed, and the second (randn) generates numbers that have a normal distribution. 
14.2.1 MATLAB Function: rand


This function generates a sequence of numbers that are uniformly distributed between 0
and 1. A simple representation of its syntax is


r = rand(m, n)


where r = an m-by-n matrix of random numbers. The following formula can then be used to
generate a uniform distribution on another interval:


runiform = low + (up - low) * rand(m, n)


where low = the lower bound and up = the upper bound.


EXAMPLE 14.2


Generating Uniform Random Values of Drag


Problem Statement. If the initial velocity is zero, the downward velocity of the
free-falling bungee jumper can be predicted with the following analytical solution [Eq. (1.9)]:

$$v = \sqrt{\frac{g m}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right)$$

Suppose that $g = 9.81$ m/s$^2$, and $m = 68.1$ kg, but $c_d$ is not known precisely. For example,
you might know that it varies uniformly between 0.225 and 0.275 (i.e., ±10% around
a mean value of 0.25 kg/m). Use the rand function to generate 1000 random uniformly
distributed values of $c_d$ and then employ these values along with the analytical solution to
compute the resulting distribution of velocities at t = 4 s.


Solution. Before generating the random numbers, we can first compute the mean velocity:

$$v_{\text{mean}} = \sqrt{\frac{9.81(68.1)}{0.25}} \tanh\left(\sqrt{\frac{9.81(0.25)}{68.1}}\, 4\right) = 33.1118\ \frac{\text{m}}{\text{s}}$$

We can also generate the range:

$$v_{\text{low}} = \sqrt{\frac{9.81(68.1)}{0.275}} \tanh\left(\sqrt{\frac{9.81(0.275)}{68.1}}\, 4\right) = 32.6223\ \frac{\text{m}}{\text{s}}$$

$$v_{\text{high}} = \sqrt{\frac{9.81(68.1)}{0.225}} \tanh\left(\sqrt{\frac{9.81(0.225)}{68.1}}\, 4\right) = 33.6198\ \frac{\text{m}}{\text{s}}$$

Thus, we can see that the velocity varies by

$$\Delta v = \frac{33.6198 - 32.6223}{2(33.1118)} \times 100\% = 1.5063\%$$

The following script generates the random values for $c_d$, along with their mean, standard
deviation, percent variation, and a histogram:


clc, format short g
n=1000;t=4;m=68.1;g=9.81;
cd=0.25;cdmin=cd-0.025;cdmax=cd+0.025;   % +/-10% bounds on the drag coefficient
r=rand(n,1);                             % n uniform random numbers between 0 and 1
cdrand=cdmin+(cdmax-cdmin)*r;            % rescale to the interval cdmin to cdmax
meancd=mean(cdrand),stdcd=std(cdrand)
Deltacd=(max(cdrand)-min(cdrand))/meancd/2*100
subplot(2,1,1)
hist(cdrand),title('(a) Distribution of drag')
xlabel('cd (kg/m)')


The results are 


meancd = 
0.25018 
stdcd = 
0.014528 
Deltacd = 
9.9762 


These results, as well as the histogram (Fig. 14.5a), indicate that rand has yielded 1000 
uniformly distributed values with the desired mean value and range. The values can then 
be employed along with the analytical solution to compute the resulting distribution of 
velocities at t = 4 s. 

vrand=sqrt(g*m./cdrand).*tanh(sqrt(g*cdrand/m)*t); 
meanv=mean(vrand) 
Deltav=(max(vrand)-min(vrand))/meanv/2*100. 
subplot(2,1,2) 
hist(vrand), title('(b) Distribution of velocity') 
xlabel('v (m/s)') 


FIGURE 14.5 
Histograms of (a) uniformly distributed drag coefficients and (b) the resulting distribution of 
velocity. 

The results are 


meanv = 

33.1151 
Deltav = 

1.5048 


These results, as well as the histogram (Fig. 14.5b), closely conform to our hand 
calculations. 


The foregoing example is formally referred to as a Monte Carlo simulation. The term, 
which is a reference to Monaco’s Monte Carlo casino, was first used by physicists working 
on nuclear weapons projects in the 1940s. Although it yields intuitive results for this simple 
example, there are instances where such computer simulations yield surprising outcomes 
and provide insights that would otherwise be impossible to determine. The approach is 
feasible only because of the computer’s ability to implement tedious, repetitive computations 
in an efficient manner. 


14.2.2 MATLAB Function: randn 


This function generates a sequence of numbers that are normally distributed with a mean of 
0 and a standard deviation of 1. A simple representation of its syntax is 


r = randn(m, n) 


where r = an m-by-n matrix of random numbers. The following formula can then be used to 
generate a normal distribution with a different mean (mn) and standard deviation (s): 


rnormal = mn + s * randn(m, n) 


EXAMPLE 14.3 Generating Normally Distributed Random Values of Drag 


Problem Statement. Analyze the same case as in Example 14.2, but rather than employ- 
ing a uniform distribution, generate normally distributed drag coefficients with a mean of 
0.25 and a standard deviation of 0.01443. 


Solution. The following script generates the random values for $c_d$, along with their mean, 
standard deviation, coefficient of variation (expressed as a %), and a histogram: 


clc, format short g 
n=1000; t=4; m=68.1; g=9.81; 
cd=0.25; 
stdev=0.01443; 
r=randn(n,1); 
cdrand=cd+stdev*r; 
meancd=mean(cdrand), stdevcd=std(cdrand) 
cvcd=stdevcd/meancd*100. 
subplot(2,1,1) 
hist(cdrand), title('(a) Distribution of drag') 
xlabel('cd (kg/m)') 


The results are 


meancd = 
0.24988 
stdevcd = 
0.014465 
cvcd = 
5.7887 


These results, as well as the histogram (Fig. 14.6a), indicate that randn has yielded 1000 
normally distributed values with the desired mean, standard deviation, and coefficient of 
variation. The values can then be employed along with the analytical solution to compute 
the resulting distribution of velocities at t = 4 s. 

vrand=sqrt(g*m./cdrand).*tanh(sqrt(g*cdrand/m)*t); 
meanv=mean(vrand), stdevv=std(vrand) 
cvv=stdevv/meanv*100. 
subplot(2,1,2) 
hist(vrand), title('(b) Distribution of velocity') 
xlabel('v (m/s)') 

The results are 


meanv = 
33.117 
stdevv = 
0.28839 
cvv = 
0.8708 

These results, as well as the histogram (Fig. 14.6b), indicate that the velocities are also 
normally distributed with a mean that is close to the value that would be computed using 
the mean and the analytical solution. In addition, we compute the associated standard 
deviation, which corresponds to a coefficient of variation of ±0.8708%. 


FIGURE 14.6 
Histograms of (a) normally distributed drag coefficients and (b) the resulting distribution of velocity. 

Although simple, the foregoing examples illustrate how random numbers can be easily 
generated within MATLAB. We will explore additional applications in the end-of-chapter problems. 


14.3 LINEAR LEAST-SQUARES REGRESSION 


Where substantial error is associated with data, the best curve-fitting strategy is to derive an 
approximating function that fits the shape or general trend of the data without necessarily 
matching the individual points. One approach to do this is to visually inspect the plotted 
data and then sketch a “best” line through the points. Although such “eyeball” approaches 
have commonsense appeal and are valid for “back-of-the-envelope” calculations, they are 
deficient because they are arbitrary. That is, unless the points define a perfect straight line 
(in which case, interpolation would be appropriate), different analysts would draw differ- 
ent lines. 

To remove this subjectivity, some criterion must be devised to establish a basis for 
the fit. One way to do this is to derive a curve that minimizes the discrepancy between the 
data points and the curve. To do this, we must first quantify the discrepancy. The simplest 
example is fitting a straight line to a set of paired observations: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. 
The mathematical expression for the straight line is 


$$y = a_0 + a_1 x + e \tag{14.8}$$


where $a_0$ and $a_1$ are coefficients representing the intercept and the slope, respectively, and $e$ 
is the error, or residual, between the model and the observations, which can be represented 
by rearranging Eq. (14.8) as 


$$e = y - a_0 - a_1 x \tag{14.9}$$


Thus, the residual is the discrepancy between the true value of y and the approximate value, 
$a_0 + a_1 x$, predicted by the linear equation. 


14.3.1 Criteria for a “Best” Fit 


One strategy for fitting a “best” line through the data would be to minimize the sum of the 
residual errors for all the available data, as in 


$$\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) \tag{14.10}$$

where n = total number of points. However, this is an inadequate criterion, as illustrated by 
Fig. 14.7a, which depicts the fit of a straight line to two points. Obviously, the best fit is the 
line connecting the points. However, any straight line passing through the midpoint of the 
connecting line (except a perfectly vertical line) results in a minimum value of Eq. (14.10) 
equal to zero because positive and negative errors cancel. 
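
The cancellation can be seen numerically with a short sketch. The two data points and the trial slopes below are illustrative assumptions; each trial line is forced through the midpoint of the two points, and the sum of the residuals is evaluated as in Eq. (14.10).

```matlab
% Sketch: the sum of residuals is zero for ANY line through the midpoint
% of two points, so Eq. (14.10) cannot single out a best fit.
x = [1 3]; y = [2 6];            % two illustrative data points (assumed)
xm = mean(x); ym = mean(y);      % midpoint of the connecting line
slopes = [-1 0 2 5];             % trial slopes; 2 is the connecting line itself
sumres = zeros(size(slopes));
for k = 1:length(slopes)
  a1 = slopes(k);
  a0 = ym - a1*xm;               % intercept that forces the line through the midpoint
  sumres(k) = sum(y - a0 - a1*x);
end
sumres                           % every entry is zero
```

All four lines give the same zero sum, even though only the line with a slope of 2 actually passes through both points.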

One way to remove the effect of the signs might be to minimize the sum of the absolute 
values of the discrepancies, as in 


$$\sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} |y_i - a_0 - a_1 x_i| \tag{14.11}$$

Figure 14.7b demonstrates why this criterion is also inadequate. For the four points shown, 
any straight line falling within the dashed lines will minimize the sum of the absolute 
values of the residuals. Thus, this criterion also does not yield a unique best fit. 


FIGURE 14.7 
Examples of some criteria for “best fit” that are inadequate for regression: (a) minimizes 
the sum of the residuals, (b) minimizes the sum of the absolute values of the residuals, and 
(c) minimizes the maximum error of any individual point. 


A third strategy for fitting a best line is the minimax criterion. In this technique, the 
line is chosen that minimizes the maximum distance that an individual point falls from 
the line. As depicted in Fig. 14.7c, this strategy is ill-suited for regression because it gives 
undue influence to an outlier—that is, a single point with a large error. It should be noted 
that the minimax principle is sometimes well-suited for fitting a simple function to a com- 
plicated function (Carnahan, Luther, and Wilkes, 1969). 

A strategy that overcomes the shortcomings of the aforementioned approaches is to 
minimize the sum of the squares of the residuals: 


$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2 \tag{14.12}$$


This criterion, which is called least squares, has a number of advantages, including that it 
yields a unique line for a given set of data. Before discussing these properties, we will 
present a technique for determining the values of $a_0$ and $a_1$ that minimize Eq. (14.12). 


14.3.2 Least-Squares Fit of a Straight Line 


To determine values for $a_0$ and $a_1$, Eq. (14.12) is differentiated with respect to each 
unknown coefficient: 


$$\frac{\partial S_r}{\partial a_0} = -2 \sum (y_i - a_0 - a_1 x_i)$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum [(y_i - a_0 - a_1 x_i)\, x_i]$$


Note that we have simplified the summation symbols; unless otherwise indicated, all 
summations are from i = 1 to n. Setting these derivatives equal to zero will result in a minimum 
$S_r$. If this is done, the equations can be expressed as 


$$0 = \sum y_i - \sum a_0 - \sum a_1 x_i$$

$$0 = \sum x_i y_i - \sum a_0 x_i - \sum a_1 x_i^2$$


Now, realizing that $\sum a_0 = n a_0$, we can express the equations as a set of two simultaneous 
linear equations with two unknowns ($a_0$ and $a_1$): 


$$n\, a_0 + \left(\sum x_i\right) a_1 = \sum y_i \tag{14.13}$$

$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i \tag{14.14}$$


These are called the normal equations. They can be solved simultaneously for 


$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} \tag{14.15}$$

This result can then be used in conjunction with Eq. (14.13) to solve for 

$$a_0 = \bar{y} - a_1 \bar{x} \tag{14.16}$$


where $\bar{y}$ and $\bar{x}$ are the means of y and x, respectively. 


EXAMPLE 14.4 Linear Regression 

Problem Statement. Fit a straight line to the values in Table 14.1. 


Solution. In this application, force is the dependent variable (y) and velocity is the 
independent variable (x). The data can be set up in tabular form and the necessary sums 
computed as in Table 14.4. 


TABLE 14.4 Data and summations needed to compute the best-fit line for the data 
from Table 14.1. 

i      x_i      y_i       x_i^2      x_i y_i 
1      10       25        100        250 
2      20       70        400        1,400 
3      30       380       900        11,400 
4      40       550       1,600      22,000 
5      50       610       2,500      30,500 
6      60       1,220     3,600      73,200 
7      70       830       4,900      58,100 
8      80       1,450     6,400      116,000 
Σ      360      5,135     20,400     312,850 


The means can be computed as 

$$\bar{x} = \frac{360}{8} = 45 \qquad \bar{y} = \frac{5{,}135}{8} = 641.875$$

The slope and the intercept can then be calculated with Eqs. (14.15) and (14.16) as 

$$a_1 = \frac{8(312{,}850) - 360(5{,}135)}{8(20{,}400) - (360)^2} = 19.47024$$

$$a_0 = 641.875 - 19.47024(45) = -234.2857$$

Using force and velocity in place of y and x, the least-squares fit is 

$$F = -234.2857 + 19.47024 v$$


The line, along with the data, is shown in Fig. 14.8. 
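
The hand calculation can be cross-checked by treating the normal equations, Eqs. (14.13) and (14.14), as a 2-by-2 linear system. The following is a minimal sketch of that idea; the variable names A, b, and a are illustrative choices, and MATLAB's backslash operator is used here in place of the closed-form expressions of Eqs. (14.15) and (14.16).

```matlab
% Sketch: solve the normal equations, Eqs. (14.13) and (14.14), as a
% 2-by-2 linear system for the force-velocity data of Table 14.1.
x = [10 20 30 40 50 60 70 80];           % velocity
y = [25 70 380 550 610 1220 830 1450];   % force
n = length(x);
A = [n       sum(x);
     sum(x)  sum(x.^2)];                 % coefficient matrix
b = [sum(y); sum(x.*y)];                 % right-hand side
a = A\b;                                 % a(1) = intercept a0, a(2) = slope a1
a0 = a(1)                                % approximately -234.2857
a1 = a(2)                                % approximately 19.4702
```

Both routes should give the same coefficients because they solve the same pair of equations.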


FIGURE 14.8 
Least-squares fit of a straight line to the data from Table 14.1. 

Notice that although the line fits the data well, the negative intercept means that the 
equation predicts physically unrealistic negative forces at low velocities. In Sec. 14.4, we 
will show how transformations can be employed to derive an alternative best-fit line that is 
more physically realistic. 


14.3.3 Quantification of Error of Linear Regression 


Any line other than the one computed in Example 14.4 results in a larger sum of the squares 
of the residuals. Thus, the line is unique and in terms of our chosen criterion is a “best” 
line through the points. A number of additional properties of this fit can be elucidated by 
examining more closely the way in which residuals were computed. Recall that the sum of 
the squares is defined as [Eq. (14.12)] 


$$S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2 \tag{14.17}$$


Notice the similarity between this equation and Eq. (14.4): 


$$S_t = \sum (y_i - \bar{y})^2 \tag{14.18}$$


In Eq. (14.18), the square of the residual represented the square of the discrepancy between 
the data and a single estimate of the measure of central tendency—the mean. In Eq. (14.17), 
the square of the residual represents the square of the vertical distance between the data and 
another measure of central tendency—the straight line (Fig. 14.9). 

FIGURE 14.9 
The residual in linear regression represents the vertical distance between a data point and 
the straight line. 

The analogy can be extended further for cases where (1) the spread of the points 
around the line is of similar magnitude along the entire range of the data and (2) the 
distribution of these points about the line is normal. It can be demonstrated that if these criteria 
are met, least-squares regression will provide the best (i.e., the most likely) estimates of $a_0$ 
and $a_1$ (Draper and Smith, 1981). This is called the maximum likelihood principle in 
statistics. In addition, if these criteria are met, a “standard deviation” for the regression line can 
be determined as [compare with Eq. (14.3)] 


$$s_{y/x} = \sqrt{\frac{S_r}{n - 2}} \tag{14.19}$$


where $s_{y/x}$ is called the standard error of the estimate. The subscript notation “y/x” 
designates that the error is for a predicted value of y corresponding to a particular value of 
x. Also, notice that we now divide by $n - 2$ because two data-derived estimates, $a_0$ and 
$a_1$, were used to compute $S_r$; thus, we have lost two degrees of freedom. As with our 
discussion of the standard deviation, another justification for dividing by $n - 2$ is that there is 
no such thing as the “spread of data” around a straight line connecting two points. Thus, for 
the case where n = 2, Eq. (14.19) yields a meaningless result because both $S_r$ and $n - 2$ are zero. 

Just as was the case with the standard deviation, the standard error of the estimate 
quantifies the spread of the data. However, $s_{y/x}$ quantifies the spread around the 
regression line, as shown in Fig. 14.10b, in contrast to the standard deviation $s_y$, which quantifies the 
spread around the mean (Fig. 14.10a). 

FIGURE 14.10 
Regression data showing (a) the spread of the data around the mean of the dependent 
variable and (b) the spread of the data around the best-fit line. The reduction in the spread 
in going from (a) to (b), as indicated by the bell-shaped curves at the right, represents the 
improvement due to linear regression. 

FIGURE 14.11 
Examples of linear regression with (a) small and (b) large residual errors. 

These concepts can be used to quantify the “goodness” of our fit. This is particularly 
useful for comparison of several regressions (Fig. 14.11). To do this, we return to the 
original data and determine the total sum of the squares around the mean for the dependent 
variable (in our case, y). As was the case for Eq. (14.18), this quantity is designated $S_t$. 
This is the magnitude of the residual error associated with the dependent variable prior to 
regression. After performing the regression, we can compute $S_r$, the sum of the squares of 
the residuals around the regression line with Eq. (14.17). This characterizes the residual 
error that remains after the regression. It is, therefore, sometimes called the unexplained 
sum of the squares. The difference between the two quantities, $S_t - S_r$, quantifies the 
improvement or error reduction due to describing the data in terms of a straight line rather 
than as an average value. Because the magnitude of this quantity is scale-dependent, the 
difference is normalized to $S_t$ to yield 


$$r^2 = \frac{S_t - S_r}{S_t} \tag{14.20}$$

where $r^2$ is called the coefficient of determination and $r$ is the correlation coefficient 
($= \sqrt{r^2}$). For a perfect fit, $S_r = 0$ and $r^2 = 1$, signifying that the line explains 100% of 
the variability of the data. For $r^2 = 0$, $S_r = S_t$ and the fit represents no improvement. An 
alternative formulation for $r$ that is more convenient for computer implementation is 


$$r = \frac{n \sum (x_i y_i) - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\, \sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}} \tag{14.21}$$


EXAMPLE 14.5 Estimation of Errors for the Linear Least-Squares Fit 


Problem Statement. Compute the total standard deviation, the standard error of the 
estimate, and the correlation coefficient for the fit in Example 14.4. 


Solution. The data can be set up in tabular form and the necessary sums computed as in 
Table 14.5. 


TABLE 14.5 Data and summations needed to compute the goodness-of-fit statistics 
for the data from Table 14.1. 

i      x_i      y_i       a_0 + a_1 x_i      (y_i - ybar)^2      (y_i - a_0 - a_1 x_i)^2 
1      10       25        -39.58             380,535             4,171 
2      20       70        155.12             327,041             7,245 
3      30       380       349.82             68,579              911 
4      40       550       544.52             8,441               30 
5      50       610       739.23             1,016               16,699 
6      60       1,220     933.93             334,229             81,837 
7      70       830       1,128.63           35,391              89,180 
8      80       1,450     1,323.33           653,066             16,044 
Σ      360      5,135                        1,808,297           216,118 


The standard deviation is [Eq. (14.3)] 


$$s_y = \sqrt{\frac{1{,}808{,}297}{8 - 1}} = 508.26$$


and the standard error of the estimate is [Eq. (14.19)] 


$$s_{y/x} = \sqrt{\frac{216{,}118}{8 - 2}} = 189.79$$


Thus, because $s_{y/x} < s_y$, the linear regression model has merit. The extent of the 
improvement is quantified by [Eq. (14.20)] 


$$r^2 = \frac{1{,}808{,}297 - 216{,}118}{1{,}808{,}297} = 0.8805$$


or $r = \sqrt{0.8805} = 0.9383$. These results indicate that 88.05% of the original uncertainty 
has been explained by the linear model. 
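
The statistics in this example can be reproduced with a few lines of MATLAB. The sketch below is one possible arrangement, with variable names (St, Sr, sy, syx) chosen here for illustration; it also evaluates Eq. (14.21) to confirm that the alternative formula gives the same correlation coefficient.

```matlab
% Sketch: goodness-of-fit statistics for the straight-line fit of Example 14.4.
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];
n = length(x);
a1 = (n*sum(x.*y) - sum(x)*sum(y))/(n*sum(x.^2) - sum(x)^2);   % Eq. (14.15)
a0 = mean(y) - a1*mean(x);                                     % Eq. (14.16)
St = sum((y - mean(y)).^2);          % spread around the mean, Eq. (14.18)
Sr = sum((y - a0 - a1*x).^2);        % spread around the line, Eq. (14.17)
sy  = sqrt(St/(n - 1))               % total standard deviation, about 508.26
syx = sqrt(Sr/(n - 2))               % standard error of the estimate, about 189.79
r2  = (St - Sr)/St                   % coefficient of determination, about 0.8805
% Eq. (14.21) should give the same correlation coefficient, about 0.9383
r = (n*sum(x.*y) - sum(x)*sum(y)) / ...
    (sqrt(n*sum(x.^2) - sum(x)^2)*sqrt(n*sum(y.^2) - sum(y)^2))
```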


Before proceeding, a word of caution is in order. Although the coefficient of 
determination provides a handy measure of goodness-of-fit, you should be careful not to ascribe 
more meaning to it than is warranted. Just because $r^2$ is “close” to 1 does not mean that 
the fit is necessarily “good.” For example, it is possible to obtain a relatively high value of 
$r^2$ when the underlying relationship between y and x is not even linear. Draper and Smith 
(1981) provide guidance and additional material regarding assessment of results for linear 
regression. In addition, at the minimum, you should always inspect a plot of the data along 
with your regression curve. 

A nice example was developed by Anscombe (1973). As in Fig. 14.12, he came up with 
four data sets consisting of 11 data points each. Although their graphs are very different, all 
have the same best-fit equation, y = 3 + 0.5x, and the same coefficient of determination, 
$r^2 = 0.67$! This example dramatically illustrates why developing plots is so valuable. 


FIGURE 14.12 
Anscombe’s four data sets along with the best-fit line, y = 3 + 0.5x. 


14.4 LINEARIZATION OF NONLINEAR RELATIONSHIPS 


Linear regression provides a powerful technique for fitting a best line to data. However, 
it is predicated on the assumption that the relationship between the dependent and independent 
variables is linear. This is not always the case, and the first step in any regression analysis 
should be to plot and visually inspect the data to ascertain whether a linear model applies. 
In some cases, techniques such as polynomial regression, which is described in Chap. 15, 
are appropriate. For others, transformations can be used to express the data in a form that 
is compatible with linear regression. 
One example is the exponential model: 


$$y = \alpha_1 e^{\beta_1 x} \tag{14.22}$$


where $\alpha_1$ and $\beta_1$ are constants. This model is used in many fields of engineering and 
science to characterize quantities that increase (positive $\beta_1$) or decrease (negative $\beta_1$) at a 
rate that is directly proportional to their own magnitude. For example, population growth 
or radioactive decay can exhibit such behavior. As depicted in Fig. 14.13a, the equation 
represents a nonlinear relationship (for $\beta_1 \neq 0$) between y and x. 

Another example of a nonlinear model is the simple power equation: 


$$y = \alpha_2 x^{\beta_2} \tag{14.23}$$


where $\alpha_2$ and $\beta_2$ are constant coefficients. This model has wide applicability in all fields 
of engineering and science. It is very frequently used to fit experimental data when the 
underlying model is not known. As depicted in Fig. 14.13b, the equation (for $\beta_2 \neq 0$ or 1) is 
nonlinear. 


FIGURE 14.13 
(a) The exponential equation, (b) the power equation, and (c) the saturation-growth-rate equation. Parts (d), (e), and 
(f) are linearized versions of these equations that result from simple transformations. 


A third example of a nonlinear model is the saturation-growth-rate equation: 


$$y = \alpha_3 \frac{x}{\beta_3 + x} \tag{14.24}$$

where $\alpha_3$ and $\beta_3$ are constant coefficients. This model, which is particularly well-suited 
for characterizing population growth rate under limiting conditions, also represents a 
nonlinear relationship between y and x (Fig. 14.13c) that levels off, or “saturates,” as 
x increases. It has many applications, particularly in biologically related areas of both 
engineering and science. 

Nonlinear regression techniques are available to fit these equations to experimental 
data directly. However, a simpler alternative is to use mathematical manipulations to trans- 
form the equations into a linear form. Then linear regression can be employed to fit the 
equations to data. 

For example, Eq. (14.22) can be linearized by taking its natural logarithm to yield 

$$\ln y = \ln \alpha_1 + \beta_1 x \tag{14.25}$$

Thus, a plot of ln y versus x will yield a straight line with a slope of $\beta_1$ and an intercept of 
$\ln \alpha_1$ (Fig. 14.13d). 
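
A minimal sketch of this transformation follows. The parameter values and x grid are assumptions made up for the illustration, the data are synthetic and noise-free, and the straight line is fit with MATLAB's built-in polyfit function (described in Sec. 14.5.2) rather than by hand.

```matlab
% Sketch: fit y = alpha1*exp(beta1*x) by regressing ln(y) on x, Eq. (14.25).
alpha1 = 2.5; beta1 = -0.4;      % assumed "true" parameters for the test data
x = 0:0.5:5;
y = alpha1*exp(beta1*x);         % synthetic, noise-free data
p = polyfit(x, log(y), 1);       % straight-line fit: p(1) = slope, p(2) = intercept
beta1fit  = p(1)                 % slope recovers beta1
alpha1fit = exp(p(2))            % intercept is ln(alpha1), so undo the logarithm
```

Because the data contain no noise, the fit should recover the assumed parameters to within roundoff. With real measurements the recovered values would only approximate the underlying ones.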

Equation (14.23) is linearized by taking its base-10 logarithm to give 

$$\log y = \log \alpha_2 + \beta_2 \log x \tag{14.26}$$

Thus, a plot of log y versus log x will yield a straight line with a slope of $\beta_2$ and an intercept 
of $\log \alpha_2$ (Fig. 14.13e). Note that any base logarithm can be used to linearize this model. 
However, as done here, the base-10 logarithm is most commonly employed. 

Equation (14.24) is linearized by inverting it to give 

$$\frac{1}{y} = \frac{1}{\alpha_3} + \frac{\beta_3}{\alpha_3} \frac{1}{x} \tag{14.27}$$

Thus, a plot of 1/y versus 1/x will be linear, with a slope of $\beta_3/\alpha_3$ and an intercept of $1/\alpha_3$ 
(Fig. 14.13f). 
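
The inversion can be sketched in the same way. Again, the parameter values and x values below are assumptions chosen only for illustration, the data are synthetic and noise-free, and the built-in polyfit function (Sec. 14.5.2) supplies the straight-line fit.

```matlab
% Sketch: fit y = alpha3*x/(beta3 + x) by regressing 1/y on 1/x, Eq. (14.27).
alpha3 = 4; beta3 = 1.5;         % assumed "true" parameters for the test data
x = [0.5 1 2 4 8 16];
y = alpha3*x./(beta3 + x);       % synthetic, noise-free data
p = polyfit(1./x, 1./y, 1);      % p(1) = beta3/alpha3, p(2) = 1/alpha3
alpha3fit = 1/p(2)               % invert the intercept
beta3fit  = p(1)*alpha3fit       % slope times alpha3
```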

In their transformed forms, these models can be fit with linear regression to evaluate 
the constant coefficients. They can then be transformed back to their original state and 
used for predictive purposes. The following illustrates this procedure for the power model. 

EXAMPLE 14.6 Fitting Data with the Power Equation 


Problem Statement. Fit Eq. (14.23) to the data in Table 14.1 using a logarithmic 
transformation. 


Solution. The data can be set up in tabular form and the necessary sums computed as in 
Table 14.6. 
The means can be computed as 


$$\bar{x} = \frac{12.606}{8} = 1.5757 \qquad \bar{y} = \frac{20.515}{8} = 2.5644$$


TABLE 14.6 Data and summations needed to fit the power model to the data from 
Table 14.1. 

i      x_i      y_i      log x_i      log y_i      (log x_i)^2      log x_i log y_i 
1      10       25       1.000        1.398        1.000            1.398 
2      20       70       1.301        1.845        1.693            2.401 
3      30       380      1.477        2.580        2.182            3.811 
4      40       550      1.602        2.740        2.567            4.390 
5      50       610      1.699        2.785        2.886            4.732 
6      60       1220     1.778        3.086        3.162            5.488 
7      70       830      1.845        2.919        3.404            5.386 
8      80       1450     1.903        3.161        3.622            6.016 
Σ                        12.606       20.515       20.516           33.622 


FIGURE 14.14 
Least-squares fit of a power model to the data from Table 14.1. (a) The fit of the transformed 
data. (b) The power equation fit along with the data. 


The slope and the intercept can then be calculated with Eqs. (14.15) and (14.16) as 


$$a_1 = \frac{8(33.622) - 12.606(20.515)}{8(20.516) - (12.606)^2} = 1.9842$$

$$a_0 = 2.5644 - 1.9842(1.5757) = -0.5620$$


The least-squares fit is 

$$\log y = -0.5620 + 1.9842 \log x$$

The fit, along with the data, is shown in Fig. 14.14a. 




We can also display the fit using the untransformed coordinates. To do this, the 
coefficients of the power model are determined as $\alpha_2 = 10^{-0.5620} = 0.2741$ and $\beta_2 = 1.9842$. 
Using force and velocity in place of y and x, the least-squares fit is 


$$F = 0.2741 v^{1.9842}$$


This equation, along with the data, is shown in Fig. 14.14b. 
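
The same coefficients can be obtained in a few lines of MATLAB. The sketch below uses the built-in polyfit function (described in Sec. 14.5.2) on the base-10 logarithms of the data and then transforms the intercept back. The low velocity of 5 used for the comparison is an arbitrary value chosen for illustration.

```matlab
% Sketch: power-model fit of this example via a log-log straight-line fit.
x = [10 20 30 40 50 60 70 80];           % velocity
y = [25 70 380 550 610 1220 830 1450];   % force
p = polyfit(log10(x), log10(y), 1);      % p(1) = slope, p(2) = intercept
beta2  = p(1)                            % approximately 1.9842
alpha2 = 10^p(2)                         % approximately 0.2741
% Compare predictions at a low velocity with the straight-line fit of Example 14.4
vlow = 5;                                % assumed test velocity
Fpower  = alpha2*vlow^beta2              % positive
Flinear = -234.2857 + 19.47024*vlow      % negative, which is physically unrealistic
```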




The fits in Example 14.6 (Fig. 14.14) should be compared with the one obtained 
previously in Example 14.4 (Fig. 14.8) using linear regression on the untransformed data. 
Although both results would appear to be acceptable, the transformed result has the advan- 
tage that it does not yield negative force predictions at low velocities. Further, it is known 
from the discipline of fluid mechanics that the drag force on an object moving through a 
fluid is often well described by a model with velocity squared. Thus, knowledge from the 
field you are studying often has a large bearing on the choice of the appropriate model 
equation you use for curve fitting. 


14.4.1 General Comments on Linear Regression 


Before proceeding to curvilinear and multiple linear regression, we must emphasize the 
introductory nature of the foregoing material on linear regression. We have focused on 
the simple derivation and practical use of equations to fit data. You should be cognizant 
of the fact that there are theoretical aspects of regression that are of practical importance 
but are beyond the scope of this book. For example, some statistical assumptions that are 
inherent in the linear least-squares procedures are 


1. Each x has a fixed value; it is not random and is known without error. 
2. The y values are independent random variables and all have the same variance. 
3. The y values for a given x must be normally distributed. 


Such assumptions are relevant to the proper derivation and use of regression. For 
example, the first assumption means that (1) the x values must be error-free and (2) the 
regression of y versus x is not the same as x versus y. You are urged to consult other refer- 
ences such as Draper and Smith (1981) to appreciate aspects and nuances of regression that 
are beyond the scope of this book. 
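
The second consequence of the first assumption can be illustrated numerically. The following sketch fits the force-velocity data both ways with the built-in polyfit function (Sec. 14.5.2); the variable names are illustrative. The slope obtained by inverting the x-versus-y line differs from the y-versus-x slope, and the product of the two fitted slopes equals the coefficient of determination.

```matlab
% Sketch: regressing y on x differs from regressing x on y.
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];
pyx = polyfit(x, y, 1);          % y = pyx(1)*x + pyx(2)
pxy = polyfit(y, x, 1);          % x = pxy(1)*y + pxy(2)
slope_yx = pyx(1)                % slope of y versus x, about 19.4702
slope_from_xy = 1/pxy(1)         % slope implied by inverting the x-versus-y line
r2 = pyx(1)*pxy(1)               % product of the two slopes equals r^2, about 0.8805
```

The two lines coincide only when the data fall exactly on a straight line, in which case $r^2 = 1$.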


14.5 COMPUTER APPLICATIONS 


Linear regression is so commonplace that it can be implemented on most pocket calcula- 
tors. In this section, we will show how a simple M-file can be developed to determine the 
slope and intercept as well as to create a plot of the data and the best-fit line. We will also 
show how linear regression can be implemented with the built-in polyfit function. 


14.5.1 MATLAB M-file: linregr 


An algorithm for linear regression can be easily developed (Fig. 14.15). The required 
summations are readily computed with MATLAB’s sum function. These are then used to 
compute the slope and the intercept with Eqs. (14.15) and (14.16). The routine displays the 
intercept and slope, the coefficient of determination, and a plot of the best-fit line along 
with the measurements. 

A simple example of the use of this M-file would be to fit the force-velocity data analyzed 
in Example 14.4: 


>> x = [10 20 30 40 50 60 70 80]; 
>> y = [25 70 380 550 610 1220 830 1450]; 
>> [a, r2] = linregr(x,y) 


a = 
19.4702 -234.2857 


r2 = 
0.8805 




It can just as easily be used to fit the power model (Example 14.6) by applying the 
log10 function to the data as in 


>> [a, r2] = linregr(log10(x),log10(y)) 


a = 
1.9842 -0.5620 


r2 = 
0.9481 


FIGURE 14.15 
An M-file to implement linear regression. 


function [a, r2] = linregr(x,y) 
% linregr: linear regression curve fitting 
%   [a, r2] = linregr(x,y): Least squares fit of straight 
%             line to data by solving the normal equations 
% input: 
%   x = independent variable 
%   y = dependent variable 
% output: 
%   a = vector of slope, a(1), and intercept, a(2) 
%   r2 = coefficient of determination 

n = length(x); 
if length(y)~=n, error('x and y must be same length'); end 
x = x(:); y = y(:); % convert to column vectors 
sx = sum(x); sy = sum(y); 
sx2 = sum(x.*x); sxy = sum(x.*y); sy2 = sum(y.*y); 
a(1) = (n*sxy-sx*sy)/(n*sx2-sx^2); 
a(2) = sy/n-a(1)*sx/n; 
r2 = ((n*sxy-sx*sy)/sqrt(n*sx2-sx^2)/sqrt(n*sy2-sy^2))^2; 
% create plot of data and best fit line 
xp = linspace(min(x),max(x),2); 
yp = a(1)*xp+a(2); 
plot(x,y,'o',xp,yp) 
grid on 


14.5.2 MATLAB Functions: polyfit and polyval 


MATLAB has a built-in function polyfit that fits a least-squares nth-order polynomial to 
data. It can be applied as in 


>> p= polyfit(x, y, n) 


where x and y are the vectors of the independent and the dependent variables, respectively, 
and n = the order of the polynomial. The function returns a vector p containing the 
polynomial’s coefficients. We should note that it represents the polynomial using decreasing 
powers of x as in the following representation: 


$$f(x) = p_1 x^n + p_2 x^{n-1} + \cdots + p_n x + p_{n+1}$$


Because a straight line is a first-order polynomial, polyfit(x,y,1) will return the slope 
and the intercept of the best-fit straight line. 


>> x = [10 20 30 40 50 60 70 80]; 
>> y = [25 70 380 550 610 1220 830 1450]; 
>> a = polyfit(x,y,1) 
a = 
   19.4702 -234.2857 


Thus, the slope is 19.4702 and the intercept is —234.2857. 
Another function, polyval, can then be used to compute a value using the coefficients. 
It has the general format: 


>> y = polyval(p, x) 

where p = the polynomial coefficients, and y = the best-fit value at x. For example, 


>> y = polyval(a,45) 
y = 
  641.8750 
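
Because polyval also accepts a vector for its second argument, the predictions at all of the data points can be generated in one call. The following sketch, with illustrative variable names, uses this to compute the residuals of the straight-line fit and their sum of squares.

```matlab
% Sketch: evaluate the best-fit line at several points with one polyval call.
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];
a = polyfit(x, y, 1);            % slope and intercept
yfit = polyval(a, x);            % predictions at every data point
res = y - yfit;                  % residuals
sumres = sum(res)                % essentially zero for a least-squares line with an intercept
Sr = sum(res.^2)                 % approximately 216,118, as in Table 14.5
```

The near-zero sum of the residuals follows from the first normal equation, Eq. (14.13), and is not by itself evidence of a good fit.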


14.6 CASE STUDY ENZYME KINETICS 


Background. Enzymes act as catalysts to speed up the rate of chemical reactions in living 
cells. In most cases, they convert one chemical, the substrate, into another, the product. The 
Michaelis-Menten equation is commonly used to describe such reactions: 


$$v = \frac{v_m [S]}{k_s + [S]} \tag{14.28}$$


where $v$ = the initial reaction velocity, $v_m$ = the maximum initial reaction velocity, $[S]$ = 
substrate concentration, and $k_s$ = a half-saturation constant. As in Fig. 14.16, the equation 
describes a saturating relationship which levels off with increasing $[S]$. The graph also 
illustrates that the half-saturation constant corresponds to the substrate concentration at 
which the velocity is half the maximum. 


FIGURE 14.16 
Two versions of the Michaelis-Menten model of enzyme kinetics. 

Although the Michaelis-Menten model provides a nice starting point, it has been 
refined and extended to incorporate additional features of enzyme kinetics. One simple 
extension involves so-called allosteric enzymes, where the binding of a substrate molecule 
at one site leads to enhanced binding of subsequent molecules at other sites. For cases with 
two interacting binding sites, the following second-order version often results in a better fit: 

$$v = \frac{v_m [S]^2}{k_s^2 + [S]^2} \tag{14.29}$$

This model also describes a saturating curve but, as depicted in Fig. 14.16, the squared 
concentrations tend to make the shape more sigmoid, or S-shaped. 
Suppose that you are provided with the following data: 


[S]    1.3     1.8     3       4.5      6        8       9 
v      0.07    0.13    0.22    0.275    0.335    0.35    0.36 


Employ linear regression to fit these data with linearized versions of Eqs. (14.28) and 
(14.29). Aside from estimating the model parameters, assess the validity of the fits with 
both statistical measures and graphs. 


Solution. Equation (14.28), which is in the format of the saturation-growth-rate model 
(Eq. 14.24), can be linearized by inverting it to give [recall Eq. (14.27)] 

$$\frac{1}{v} = \frac{1}{v_m} + \frac{k_s}{v_m} \frac{1}{[S]}$$


The linregr function from Fig. 14.15 can then be used to determine the least-squares fit: 


>> S=[1.3 1.8 3 4.5 6 8 9]; 
>> v=[0.07 0.13 0.22 0.275 0.335 0.35 0.36]; 
>> [a,r2]=linregr(1./S,1./v) 
a = 
   16.4022    0.1902 
r2 = 
    0.9344 

The model coefficients can then be calculated as 
>> vm=1/a(2) 
vm = 
5.2570 
>> ks=vm*a(1) 
ks = 
86.2260 


Thus, the best-fit model is 


$$v = \frac{5.2570 [S]}{86.2260 + [S]}$$


Although the high value of $r^2$ might lead you to believe that this result is acceptable, 
inspection of the coefficients might raise doubts. For example, the maximum velocity 
(5.2570) is much greater than the highest observed velocity (0.36). In addition, the 
half-saturation constant (86.2260) is much bigger than the maximum substrate concentration (9). 

The problem is underscored when the fit is plotted along with the data. Figure 14.17a 
shows the transformed version. Although the straight line follows the upward trend, the 
data clearly appear to be curved. When the original equation is plotted along with the data 
in the untransformed version (Fig. 14.17b), the fit is obviously unacceptable. The data 
are clearly leveling off at about 0.36 or 0.37. If this is correct, an eyeball estimate would 
suggest that $v_m$ should be about 0.36, and $k_s$ should be in the range of 2 to 3. 

Beyond the visual evidence, the poorness of the fit is also reflected by statistics like 
the coefficient of determination. For the untransformed case, a much less acceptable result 
of $r^2 = 0.6406$ is obtained. 

The foregoing analysis can be repeated for the second-order model. Equation (14.29)
can also be linearized by inverting it to give

$$\frac{1}{v_0} = \frac{1}{v_m} + \frac{k_s^2}{v_m}\,\frac{1}{[S]^2}$$

The linregr function from Fig. 14.15 can again be used to determine the least-squares fit:
>> [a,r2]=linregr(1./S.^2,1./v)
a =
   19.3760    2.4492
r2 =
    0.9929


(a) Transformed model (horizontal axis: 1/[S])

(b) Original model


FIGURE 14.17 

Plots of least-squares fit (line) of the Michaelis-Menten model along with data (points). The 
plot in (a) shows the transformed fit, and (b) shows how the fit looks when viewed in the 
untransformed, original form. 


The model coefficients can then be calculated as 
>> vm=1/a(2) 
vm = 
0.4083 
>> ks=sqrt(vm*a(1)) 
ks = 
2.8127 
Substituting these values into Eq. (14.29) gives 


$$v_0 = \frac{0.4083[S]^2}{7.911 + [S]^2}$$

Although we know that a high $r^2$ does not guarantee a good fit, the fact that it is very
high (0.9929) is promising. In addition, the parameter values also seem consistent with the
trends in the data; that is, $v_m$ is slightly greater than the highest observed velocity and
the half-saturation constant is lower than the maximum substrate concentration (9).


(a) Transformed model (horizontal axis: 1/[S]²)

(b) Original model


FIGURE 14.18 

Plots of least-squares fit (line) of the second-order Michaelis-Menten model along with data 
(points). The plot in (a) shows the transformed fit, and (b) shows the untransformed, original 
form. 


The adequacy of the fit can be assessed graphically. As in Fig. 14.18a, the transformed 
results appear linear. When the original equation is plotted along with the data in the 
untransformed version (Fig. 14.18b), the fit nicely follows the trend in the measurements. 
Beyond the graphs, the goodness of the fit is also reflected by the fact that the coefficient
of determination for the untransformed case can be computed as $r^2$ = 0.9896.

Based on our analysis, we can conclude that the second-order model provides a good 
fit of this data set. This might suggest that we are dealing with an allosteric enzyme. 

Beyond this specific result, there are a few other general conclusions that can be drawn
from this case study. First, we should never rely on statistics such as $r^2$ as the sole
basis of assessing goodness of fit. Second, regression equations should always be assessed
graphically. And for cases where transformations are employed, a graph of the
untransformed model and data should always be inspected.

Finally, although transformations may yield a decent fit of the transformed data, this
does not always translate into an acceptable fit in the original format. The reason that this
might occur is that minimizing the squared residuals of the transformed data is not the same as
minimizing them for the untransformed data. Linear regression assumes that the scatter of
points around the best-fit line follows a Gaussian distribution, and that the standard deviation
is the same at every value of the dependent variable. These assumptions are rarely true after
transforming data.


As a consequence of the last conclusion, some analysts suggest that rather than using 
linear transformations, nonlinear regression should be employed to fit curvilinear data. 
In this approach, a best-fit curve is developed that directly minimizes the untransformed 
residuals. We will describe how this is done in Chap. 15. 
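
As a preview of that idea, the following is a minimal sketch (not the Chap. 15 implementation) of how the untransformed residuals of the second-order model could be minimized directly with MATLAB's fminsearch function. The names model, fSSR, and p0 are illustrative, the substrate concentrations in S are assumed to be the case-study values, and the linearized estimates are used only as the starting guess.

```matlab
% Direct (nonlinear) fit of v = vm*S^2/(ks^2 + S^2) to the untransformed data
S = [1.3 1.8 3 4.5 6 8 9];                   % substrate concentrations (assumed)
v = [0.07 0.13 0.22 0.275 0.335 0.35 0.36];  % initial reaction rates
model = @(p,S) p(1)*S.^2./(p(2)^2 + S.^2);   % p(1) = vm, p(2) = ks
fSSR = @(p) sum((v - model(p,S)).^2);        % sum of squared untransformed residuals
p0 = [0.4083 2.8127];                        % starting guess from the linearized fit
p = fminsearch(fSSR, p0);                    % Nelder-Mead search for the minimum
fprintf('vm = %.4f, ks = %.4f, SSR = %.6f\n', p(1), p(2), fSSR(p))
```

Because the search starts from the linearized estimates, the resulting sum of squared residuals can be compared directly with fSSR(p0) to see how much the transformation distorted the fit.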


PROBLEMS 


14.1 Given the data 


0.90 1.42 1.30 1.55 1.63 
1.32 1.35 1.47 1.95 1.66 
1.96 1.47 1.92 1.35 1.05 
1.85 1.74 1.65 1.78 1.71 
2.29 1.82 2.06 2.14 1.27 


Determine (a) the mean, (b) median, (c) mode, (d) range, 
(e) standard deviation, (f) variance, and (g) coefficient of 
variation. 

14.2 Construct a histogram from the data from Prob. 14.1. 
Use a range from 0.8 to 2.4 with intervals of 0.2. 

14.3 Given the data 


29.65 28.55 28.65 30.15 29.35 29.75 29.25
30.65 28.15 29.85 29.05 30.25 30.85 28.75
29.65 30.45 29.15 30.45 33.65 29.35 29.75
31.25 29.45 30.15 29.65 30.55 29.65 29.25


Determine (a) the mean, (b) median, (c) mode, (d) range,
(e) standard deviation, (f) variance, and (g) coefficient of
variation.

(h) Construct a histogram. Use a range from 28 to 34 with
increments of 0.4.

(i) Assuming that the distribution is normal, and that your
estimate of the standard deviation is valid, compute the
range (i.e., the lower and the upper values) that encompasses
68% of the readings. Determine whether this is a
valid estimate for the data in this problem.

14.4 Using the same approach as was employed to derive
Eqs. (14.15) and (14.16), derive the least-squares fit of the
following model:


$$y = a_1 x + e$$


That is, determine the slope that results in the least-squares 
fit for a straight line with a zero intercept. Fit the following 
data with this model and display the result graphically. 


14.5 Use least-squares regression to fit a straight line to 


kad 

=] 
N 
> 
D 
Ko) 


11 12 15 17 19 
8 8 10 12- 12 


< 
oa 
a 
N 
fo) 
Ko) 


Along with the slope and intercept, compute the standard 
error of the estimate and the correlation coefficient. Plot the 
data and the regression line. Then repeat the problem, but 
regress x versus y—that is, switch the variables. Interpret 
your results. 

14.6 Fit a power model to the data from Table 14.1, but use 
natural logarithms to perform the transformations. 

14.7 The following data were gathered to determine the 
relationship between pressure and temperature of a fixed 
volume of 1 kg of nitrogen. The volume is 10 m³.


T, °C      -40     0      40     80      120     160
p, N/m²    6900    8100   9350   10,500  11,700  12,800


Employ the ideal gas law pV = nRT to determine R on the
basis of these data. Note that for the law, T must be expressed
in kelvins.
14.8 Beyond the examples in Fig. 14.13, there are other
models that can be linearized using transformations. For
example,

$$y = \alpha_4 x e^{\beta_4 x}$$

Linearize this model and use it to estimate $\alpha_4$ and $\beta_4$ based
on the following data. Develop a plot of your fit along with
the data.


x 0.1 0.2 0.4 0.6 0.9 1.3 1.5 1.7 1.8 
y 0.75 1.25 1.45 1.25 0.85 0.55 0.35 0.28 0.18 


14.9 The concentration of E. coli bacteria in a swimming 
area is monitored after a storm: 


t (hr)            4     8     12    16   20   24
c (CFU/100 mL)    1600  1320  1000  890  650  560


The time is measured in hours following the end of the storm 
and the unit CFU is a “colony forming unit.” Use this data 
to estimate (a) the concentration at the end of the storm 
(t = 0) and (b) the time at which the concentration will reach 
200 CFU/100 mL. Note that your choice of model should 
be consistent with the fact that negative concentrations are 
impossible and that the bacteria concentration always de- 
creases with time. 

14.10 Rather than using the base-e exponential model 
[Eq. (14.22)], a common alternative is to employ a base-10 
model: 


$$y = \alpha_5 10^{\beta_5 x}$$


When used for curve fitting, this equation yields identical
results to the base-e version, but the value of the exponent
parameter ($\beta_5$) will differ from that estimated with
Eq. (14.22) ($\beta_1$). Use the base-10 version to solve Prob.
14.9. In addition, develop a formulation to relate $\beta_1$ to $\beta_5$.
14.11 Determine an equation to predict metabolism rate as 
a function of mass based on the following data. Use it to 
predict the metabolism rate of a 200-kg tiger. 


Animal    Mass (kg)    Metabolism (watts)
Cow       400          270
Human     70           82
Sheep     45           50
Hen       2            4.8
Rat       0.3          1.45
Dove      0.16         0.97


14.12 On average, the surface area A of human beings is 
related to weight W and height H. Measurements on a num- 
ber of individuals of height 180 cm and different weights 
(kg) give values of A (m²) in the following table:


W (kg)    70    75    77    80    82    84    87    90
A (m²)    2.10  2.12  2.15  2.20  2.22  2.23  2.26  2.30


Show that a power law $A = aW^b$ fits these data reasonably
well. Evaluate the constants a and b, and predict what the 
surface area is for a 95-kg person. 
14.13 Fit an exponential model to 


132 
1490 


1.6 2 2.3 
1950 2850 360 


oO 


Plot the data and the equation on both standard and semi- 
logarithmic graphs with the MATLAB subplot function. 
14.14 An investigator has reported the data tabulated below for
an experiment to determine the growth rate of bacteria k (per d)
as a function of oxygen concentration c (mg/L). It is known
that such data can be modeled by the following equation:

$$k = \frac{k_{\max} c^2}{c_s + c^2}$$

where $c_s$ and $k_{\max}$ are parameters. Use a transformation to
linearize this equation. Then use linear regression to estimate
$c_s$ and $k_{\max}$ and predict the growth rate at c = 2 mg/L.


=F 


14.15 Develop an M-file function to compute descriptive 
statistics for a vector of values. Have the function determine 
and display number of values, mean, median, mode, range, 
standard deviation, variance, and coefficient of variation. In 
addition, have it generate a histogram. Test it with the data 
from Prob. 14.3. 

14.16 Modify the linregr function in Fig. 14.15 so that it 
(a) computes and returns the standard error of the estimate 
and (b) uses the subplot function to also display a plot of 
the residuals (the predicted minus the measured y) versus x. 
Test it for the data from Examples 14.2 and 14.3. 

14.17 Develop an M-file function to fit a power model. 
Have the function return the best-fit coefficient a, and 
power p, along with the $r^2$ for the untransformed model. In
addition, use the subplot function to display graphs of both 
the transformed and untransformed equations along with the 
data. Test it with the data from Prob. 14.11.
14.18 The following data show the relationship between the
viscosity of SAE 70 oil and temperature. After taking the log
of the data, use linear regression to find the equation of the
line that best fits the data and the $r^2$ value.


Temperature, °C         26.67   93.33   148.89   315.56
Viscosity, μ, N·s/m²    1.35    0.085   0.012    0.00075


14.19 You perform experiments and determine the follow- 
ing values of heat capacity c at various temperatures T for 
a gas: 


| 


—50 
1250 


—30 0 60 90 


1280 1350 1480 1580 1700 


5 


Use regression to determine a model to predict c as a func- 
tion of T. 

14.20 It is known that the tensile strength of a plastic in- 
creases as a function of the time it is heat treated. The fol- 
lowing data are collected: 


Time                10   15   20   25   40   50   55   60   75
Tensile Strength    5    20   18   40   33   54   70   60   78


(a) Fit a straight line to these data and use the equation to 
determine the tensile strength at a time of 32 min. 

(b) Repeat the analysis but for a straight line with a zero 
intercept. 

14.21 The following data were taken from a stirred tank 

reactor for the reaction A → B. Use the data to determine
the best possible estimates for $k_{01}$ and $E_1$ for the following
kinetic model:

$$-\frac{dA}{dt} = k_{01} e^{-E_1/RT} A$$


where R is the gas constant and equals 0.00198 kcal/mol/K. 


—dA/dt (moles/L/s) 460 960 2485 1600 1245 
A (moles/L) 200 150 50 20 10 
T (K) 280 320 450 500 550 


14.22 Concentration data were collected at 15 time points
for the polymerization reaction:


$$xA + yB \rightarrow A_x B_y$$


We assume the reaction occurs via a complex mechanism
consisting of many steps. Several models have been hypothesized,
and the sum of the squares of the residuals has been
calculated for the fits of the models to the data. The results
are shown below. Which model best describes the data
(statistically)? Explain your choice.


                                  Model A    Model B    Model C
S_r                               135        105        100
Number of Model Parameters Fit    2          3          5


14.23 Below are data taken from a batch reactor of bacterial 
growth (after lag phase was over). The bacteria are allowed 
to grow as fast as possible for the first 2.5 hours, and then 
they are induced to produce a recombinant protein, the pro- 
duction of which slows the bacterial growth significantly. 
The theoretical growth of bacteria can be described by 


$$\frac{dX}{dt} = \mu X$$

where X is the number of bacteria, and μ is the specific
growth rate of the bacteria during exponential growth. Based
on the data, estimate the specific growth rate of the bacteria 
during the first 2 hours of growth and during the next 4 hours 
of growth. 


Time, h         0      1      2      3      4      5      6
[Cells], g/L    0.100  0.335  1.102  1.655  2.453  3.702  5.460


14.24 A transportation engineering study was conducted to 
determine the proper design of bike lanes. Data were gath- 
ered on bike-lane widths and average distance between bikes 
and passing cars. The data from 9 streets are 


Distance, m 
Lane Width, m 


(a) Plot the data. 

(b) Fit a straight line to the data with linear regression. Add 
this line to the plot. 

(c) If the minimum safe average distance between bikes and 
passing cars is considered to be 1.8 m, determine the 
corresponding minimum lane width. 

14.25 In water-resources engineering, the sizing of reser- 

voirs depends on accurate estimates of water flow in the 

river that is being impounded. For some rivers, long-term 
historical records of such flow data are difficult to obtain. 

In contrast, meteorological data on precipitation are often 

available for many years past. Therefore, it is often useful
to determine a relationship between flow and precipitation.
This relationship can then be used to estimate flows for
years when only precipitation measurements were made.
The following data are available for a river that is to be
impounded:


Precipitation, cm/yr    88.9   108.5   104.1   139.7   127    94     116.8   99.1
Flow, m³/s              14.6   16.7    15.3    23.2    19.5   16.1   18.1    16.6


(a) Plot the data. 

(b) Fit a straight line to the data with linear regression. 
Superimpose this line on your plot. 

(c) Use the best-fit line to predict the annual water flow if 
the precipitation is 120 cm. 

(d) If the drainage area is 1100 km², estimate what fraction
of the precipitation is lost via processes such as evapo- 
ration, deep groundwater infiltration, and consumptive 
use. 

14.26 The mast of a sailboat has a cross-sectional area of
10.65 cm² and is constructed of an experimental aluminum
alloy. Tests were performed to define the relationship
between stress and strain. The test results are


Strain, cm/cm    0.0032   0.0045   0.0055   0.0016   0.0085   0.0005
Stress, N/cm²    4970     5170     5500     3590     6900     1240


The stress caused by wind can be computed as $F/A_c$, where F =
force in the mast and $A_c$ = mast’s cross-sectional area. This
value can then be substituted into Hooke’s law to determine
the mast’s deflection, ΔL = strain × L, where L = the mast’s
length. If the wind force is 25,000 N, use the data to estimate
the deflection of a 9-m mast.

14.27 The following data were taken from an experiment 
that measured the current in a wire for various imposed 
voltages: 


VV 2 3 4 5 7 10 
} 2159 


(a) On the basis of a linear regression of this data, determine 
current for a voltage of 3.5 V. Plot the line and the data 
and evaluate the fit. 

(b) Redo the regression and force the intercept to be zero. 


14.28 An experiment is performed to determine the % elon- 
gation of electrical conducting material as a function of tem- 
perature. The resulting data are listed below. Predict the % 
elongation for a temperature of 400 °C. 


Temperature, °C    200   250   300   375   425    475    600
% Elongation       7.5   8.6   8.7   10    11.3   12.7   15.3


14.29 The population p of a small community on the outskirts
of a city grows rapidly over a 20-year period:


t (years)    0     5     10    15    20
p            100   200   450   950   2000


As an engineer working for a utility company, you must 
forecast the population 5 years into the future in order to 
anticipate the demand for power. Employ an exponential 
model and linear regression to make this prediction. 

14.30 The velocity u of air flowing past a flat surface is
measured at several distances y away from the surface. Fit
a curve to this data assuming that the velocity is zero at the
surface (y = 0). Use your result to determine the shear stress
(μ du/dy) at the surface where μ = 1.8 × 10⁻⁵ N·s/m².


y, m      0.002   0.006   0.012   0.018   0.024
u, m/s    0.287   0.899   1.915   3.048   4.299


14.31 Andrade’s equation has been proposed as a model of 
the effect of temperature on viscosity: 


$$\mu = D e^{B/T_a}$$

where μ = dynamic viscosity of water (10⁻³ N·s/m²), $T_a$ =
absolute temperature (K), and D and B are parameters. Fit
this model to the following data for water:


T (°C)    0       5       10      20      30       40
μ         1.787   1.519   1.307   1.002   0.7975   0.6529


14.32 Perform the same computation as in Example 14.2, 
but in addition to the drag coefficient, also vary the mass 
uniformly by ±10%.

14.33 Perform the same computation as in Example 14.3,
but in addition to the drag coefficient, also vary the mass
normally around its mean value with a coefficient of variation
of 5.7887%.
14.34 Manning’s formula for a rectangular channel can be
written as

$$Q = \frac{1}{n_m}\,\frac{(BH)^{5/3}}{(B + 2H)^{2/3}}\,\sqrt{S}$$

where Q = flow (m³/s), $n_m$ = a roughness coefficient, B =
width (m), H = depth (m), and S = slope. You are applying
this formula to a stream where you know that the width = 20 m
and the depth = 0.3 m. Unfortunately, you know the roughness
and the slope to only a ±10% precision. That is, you
know that the roughness is about 0.03 with a range from 0.027 
to 0.033 and the slope is 0.0003 with a range from 0.00027 
to 0.00033. Assuming uniform distributions, use a Monte 
Carlo analysis with n = 10,000 to estimate the distribution 
of flow. 

14.35 A Monte Carlo analysis can be used for optimization. 
For example, the trajectory of a ball can be computed with 
$$y = (\tan\theta_0)\,x - \frac{g}{2v_0^2\cos^2\theta_0}\,x^2 + y_0 \tag{P14.35}$$

where y = the height (m), $\theta_0$ = the initial angle (radians),
$v_0$ = the initial velocity (m/s), g = the gravitational constant =
9.81 m/s², and $y_0$ = the initial height (m). Given $y_0$ = 1 m,
$v_0$ = 25 m/s, and $\theta_0$ = 50°, determine the maximum height
and the corresponding x distance (a) analytically with cal- 
culus and (b) numerically with Monte Carlo simulation. For 
the latter, develop a script that generates a vector of 10,000 
uniformly distributed values of x between 0 and 60 m. Use 
this vector and Eq. (P14.35) to generate a vector of heights. 
Then, employ the max function to determine the maximum 
height and the associated x distance. 

14.36 Stokes Settling Law provides a means to compute
the settling velocity of spherical particles under laminar
conditions

$$v_s = \frac{g}{18}\left(\frac{\rho_s - \rho}{\mu}\right) d^2$$

where $v_s$ = the terminal settling velocity (m/s), g =
gravitational acceleration (= 9.81 m/s²), ρ = the fluid
density (kg/m³), $\rho_s$ = the particle density (kg/m³), μ =
the dynamic viscosity of the fluid (N s/m²), and d = the
particle diameter (m). Suppose that you conduct an experiment
in which you measure the terminal settling velocities
of a number of 10-μm spheres having different
densities,


ρ_s, kg/m³       1500   1600   1700   1800   1900   2000   2100   2200   2300
v_s, 10⁻³ m/s    1.03   1.12   1.59   1.76   2.42   2.51   3.06   3      3.5


(a) Generate a labeled plot of the data. (b) Fit a straight line 
to the data with linear regression (polyfit) and superimpose 
this line on your plot. (c) Use the model to predict the settling
velocity of a 2500 kg/m³ density sphere. (d) Use the
slope and the intercept to estimate the fluid’s viscosity and 
density. 

14.37 Beyond the examples in Fig. 14.13, there are other 
models that can be linearized using transformations. For ex- 
ample, the following model applies to third-order chemical 
reactions in batch reactors 


$$c = c_0\,\frac{1}{\sqrt{1 + 2kc_0^2 t}}$$

where c = concentration (mg/L), $c_0$ = initial concentration
(mg/L), k = reaction rate (L²/(mg² d)), and t = time (d). Linearize
this model and use it to estimate k and $c_0$ based on the
following data. Develop plots of your fit along with the data
for both the transformed, linearized form and the untransformed
form.


t    0      0.5    1      1.5    2      3      4     5
c    3.26   2.09   1.62   1.48   1.17   1.06   0.9   0.85


14.38 In Chap. 7 we presented optimization techniques to
find the optimal values of one- and multi-dimensional func- 
tions. Random numbers provide an alternative means to 
solve the same sorts of problems (recall Prob. 14.35). This is 
done by repeatedly evaluating the function at randomly se- 
lected values of the independent variable and keeping track 
of the one that yields the best value of the function being 
optimized. If a sufficient number of samples are conducted, 
the optimum will eventually be located. In their simplest 
manifestations, such approaches are not very efficient. How- 
ever, they do have the advantage that they can detect global 
optimums for functions with lots of local optima. Develop a 
function that uses random numbers to locate the maximum 
of the humps function 


$$f(x) = \frac{1}{(x - 0.3)^2 + 0.01} + \frac{1}{(x - 0.9)^2 + 0.04} - 6$$


in the domain bounded by x = 0 to 2. Here is a script that you 
can use to test your function 


clear,clc,clf,format compact
xmin=0;xmax=2;n=1000
xp=linspace(xmin,xmax,200); yp=f(xp);
plot(xp,yp)
[xopt,fopt]=RandOpt(@f,n,xmin,xmax)

FIGURE P14.39
A two-dimensional function with a maximum of 1.25 at
x = -1 and y = 1.5.


14.39 Using the same approach as described in Prob. 14.38, 
develop a function that uses random numbers to determine 
the maximum and corresponding x and y values of the fol- 
lowing two-dimensional function 


$$f(x, y) = y - x - 2x^2 - 2xy - y^2$$


in the domain bounded by x = —2 to 2 and y = 1 to 3. The 
domain is depicted in Fig. P14.39. Notice that a single maxi- 
mum of 1.25 occurs at x = —1 and y = 1.5. Here is a script 
that you can use to test your function 


clear,clc,format compact 
xint=[-2;2];yint=[1;3];n=10000; 
[xopt,yopt,fopt]=RandOpt2D(@fxy,n,xint,yint)


14.40 Suppose that a population of particles is confined to
motion along a one-dimensional line (Fig. P14.40). Assume
that each particle has an equal likelihood of moving a distance,
Δx, to either the left or right over a time step, Δt.
At t = 0 all particles are grouped at x = 0 and are allowed
to take one step in either direction. After Δt, approximately
50% will step to the right and 50% to the left. After 2Δt, 25%
would be two steps to the left, 25% would be two steps to
the right, and 50% would have stepped back to the origin.
With additional time, the particles would spread out with
the population greater near the origin and diminishing at the
ends. The net result is that the distribution of the particles
approaches a spreading bell-shaped distribution. This process,
formally called a random walk (or drunkard’s walk),
describes many phenomena in engineering and science with
a common example being Brownian motion. Develop a
MATLAB function that is given a stepsize (Δx), and a total
number of particles (n) and steps (m). At each step, determine
the location along the x axis of each particle and use
these results to generate an animated histogram that displays
how the distribution’s shape evolves as the computation
progresses.

FIGURE P14.40
The one-dimensional random or “drunkard’s” walk.

14.41 Repeat Prob. 14.40, but for a two-dimensional random
walk. As depicted in Fig. P14.41, have each particle
take a random step of length Δ at a random angle θ ranging
from 0 to 2π. Generate an animated two-panel stacked plot
with the location of all the particles displayed on the top plot
(subplot(2,1,1)), and the histogram of the particles’ x coordinates
on the bottom (subplot(2,1,2)).

FIGURE P14.41
Depiction of the steps of a two-dimensional random walk.

14.42 The table below shows the 2015 world record times
and holders for outdoor running. Note that all but the 100 m
and the marathon (42,195 m) are run on oval tracks.


Event (m)    Time (s)    Men Holder      Time (s)    Women Holder
100          9.58        Bolt            10.49       Griffith-Joyner
200          19.19       Bolt            21.34       Griffith-Joyner
400          43.18       Johnson         47.60       Koch
800          100.90      Rudisha         113.28      Kratochvilova
1000         131.96      Ngeny           148.98      Masterkova
1500         206.00      El Guerrouj     230.07      Dibaba
2000         284.79      El Guerrouj     325.35      O'Sullivan
5000         757.40      Bekele          851.15      Dibaba
10,000       1577.53     Bekele          1771.78     Wang
20,000       3386.00     Gebrselassie    3926.60     Loroupe
42,195       7377.00     Kimetto         8125.00     Radcliffe


Fit a power model for each gender and use it to predict the
record time for a half marathon (21,097.5 m). Note that the
actual records for the half marathon are 3503 s (Tadese) and
3909 s (Kiplagat) for men and women, respectively.


General Linear Least-Squares and
Nonlinear Regression


CHAPTER OBJECTIVES 


This chapter takes the concept of fitting a straight line and extends it to (a) fitting
a polynomial and (b) fitting a variable that is a linear function of two or more
independent variables. We will then show how such applications can be generalized 
and applied to a broader group of problems. Finally, we will illustrate how 
optimization techniques can be used to implement nonlinear regression. Specific 
objectives and topics covered are 


Knowing how to implement polynomial regression. 

Knowing how to implement multiple linear regression. 

Understanding the formulation of the general linear least-squares model. 
Understanding how the general linear least-squares model can be solved with 
MATLAB using either the normal equations or left division. 

Understanding how to implement nonlinear regression with optimization 
techniques. 


POLYNOMIAL REGRESSION 


In Chap. 14, a procedure was developed to derive the equation of a straight line using the
least-squares criterion. Some data, although exhibiting a marked pattern such as seen in 
Fig. 15.1, are poorly represented by a straight line. For these cases, a curve would be bet- 
ter suited to fit the data. As discussed in Chap. 14, one method to accomplish this objec- 
tive is to use transformations. Another alternative is to fit polynomials to the data using 
polynomial regression. 

The least-squares procedure can be readily extended to fit the data to a higher-order 
polynomial. For example, suppose that we fit a second-order polynomial or quadratic: 


$$y = a_0 + a_1 x + a_2 x^2 + e \tag{15.1}$$

FIGURE 15.1
(a) Data that are ill-suited for linear least-squares regression. (b) Indication that a parabola is
preferable.

For this case the sum of the squares of the residuals is

$$S_r = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)^2 \tag{15.2}$$

To generate the least-squares fit, we take the derivative of Eq. (15.2) with respect to
each of the unknown coefficients of the polynomial, as in

$$\frac{\partial S_r}{\partial a_0} = -2 \sum \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum x_i \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)$$

$$\frac{\partial S_r}{\partial a_2} = -2 \sum x_i^2 \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)$$

These equations can be set equal to zero and rearranged to develop the following set of
normal equations:

$$(n)\,a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i$$

$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum x_i y_i$$

$$\left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum x_i^2 y_i$$

where all summations are from i = 1 through n. Note that the preceding three equations are
linear and have three unknowns: $a_0$, $a_1$, and $a_2$. The coefficients of the unknowns can be
calculated directly from the observed data.

For this case, we see that the problem of determining a least-squares second-order
polynomial is equivalent to solving a system of three simultaneous linear equations. The
second-order case can be easily extended to an mth-order polynomial as in

$$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m + e$$

The foregoing analysis can be easily extended to this more general case. Thus, we can
recognize that determining the coefficients of an mth-order polynomial is equivalent to
solving a system of m + 1 simultaneous linear equations. For this case, the standard error
is formulated as

$$s_{y/x} = \sqrt{\frac{S_r}{n - (m + 1)}} \tag{15.3}$$

This quantity is divided by n − (m + 1) because (m + 1) data-derived coefficients,
$a_0, a_1, \ldots, a_m$, were used to compute $S_r$; thus, we have lost m + 1 degrees of freedom.
In addition to the standard error, a coefficient of determination can also be computed for
polynomial regression with Eq. (14.20).


EXAMPLE 15.1

Polynomial Regression


Problem Statement. Fit a second-order polynomial to the data in the first two columns 
of Table 15.1. 


TABLE 15.1 Computations for an error analysis of the quadratic least-squares fit.


x_i    y_i      (y_i − ȳ)²    (y_i − a_0 − a_1 x_i − a_2 x_i²)²
0      2.1      544.44        0.14332
1      7.7      314.47        1.00286
2      13.6     140.03        1.08160
3      27.2     3.12          0.80487
4      40.9     239.22        0.61959
5      61.1     1272.11       0.09434
Σ      152.6    2513.39       3.74657


Solution. The following can be computed from the data:


m = 2         Σx_i = 15        Σx_i⁴ = 979
n = 6         Σy_i = 152.6     Σx_i y_i = 585.6
x̄ = 2.5       Σx_i² = 55       Σx_i² y_i = 2488.8
ȳ = 25.433    Σx_i³ = 225


Therefore, the simultaneous linear equations are

$$\begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{Bmatrix}$$


These equations can be solved to evaluate the coefficients. For example, using MATLAB:


>> N = [6 15 55;15 55 225;55 225 979];
>> r = [152.6 585.6 2488.8]';
>> a = N\r
a =
    2.4786
    2.3593
    1.8607


Therefore, the least-squares quadratic equation for this case is 
$$y = 2.4786 + 2.3593x + 1.8607x^2$$

The standard error of the estimate based on the regression polynomial is [Eq. (15.3)]

$$s_{y/x} = \sqrt{\frac{3.74657}{6 - (2 + 1)}} = 1.1175$$

The coefficient of determination is

$$r^2 = \frac{2513.39 - 3.74657}{2513.39} = 0.99851$$

and the correlation coefficient is r = 0.99925.

FIGURE 15.2
Fit of a second-order polynomial.

These results indicate that 99.851 percent of the original uncertainty has been explained
by the model. This result supports the conclusion that the quadratic equation
represents an excellent fit, as is also evident from Fig. 15.2. 
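
The whole computation in this example can also be sketched directly from the raw data rather than from hand-computed sums. The sketch below is one possible arrangement, not the book's own listing: the matrix name Z and the variables Sr, St, syx, and r2 are illustrative choices. It forms the normal equations from a matrix whose columns are 1, x, and x², and then compares the result with MATLAB's built-in polyfit, which returns the coefficients in descending powers.

```matlab
% Quadratic least-squares fit of the data in Table 15.1
x = [0 1 2 3 4 5]';
y = [2.1 7.7 13.6 27.2 40.9 61.1]';
Z = [ones(size(x)) x x.^2];       % columns: 1, x, x^2
a = (Z'*Z)\(Z'*y);                % solve the normal equations for a0, a1, a2
Sr = sum((y - Z*a).^2);           % sum of squares of the residuals
St = sum((y - mean(y)).^2);       % total sum of squares about the mean
syx = sqrt(Sr/(length(x) - 3));   % standard error, n - (m + 1) with m = 2
r2 = 1 - Sr/St;                   % coefficient of determination
p = polyfit(x, y, 2);             % built-in fit, ordered [a2 a1 a0]
disp(a'), disp(fliplr(p))
fprintf('syx = %.4f, r2 = %.5f\n', syx, r2)
```

The product Z'*Z reproduces the coefficient matrix N used above, and Z'*y reproduces the right-hand-side vector r, so no sums need to be tabulated by hand.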


15.2 MULTIPLE LINEAR REGRESSION


Another useful extension of linear regression is the case where y is a linear function of two 
or more independent variables. For example, y might be a linear function of $x_1$ and $x_2$, as in

$$y = a_0 + a_1 x_1 + a_2 x_2 + e$$


Such an equation is particularly useful when fitting experimental data where the variable 
being studied is often a function of two other variables. For this two-dimensional case, the 
regression “line” becomes a “plane” (Fig. 15.3). 

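As with the previous cases, the “best” values of the coefficients are determined by
formulating the sum of the squares of the residuals:

$$S_r = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i}\right)^2 \tag{15.4}$$

and differentiating with respect to each of the unknown coefficients:

$$\frac{\partial S_r}{\partial a_0} = -2 \sum \left(y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i}\right)$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum x_{1,i} \left(y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i}\right)$$

$$\frac{\partial S_r}{\partial a_2} = -2 \sum x_{2,i} \left(y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i}\right)$$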

FIGURE 15.3
Graphical depiction of multiple linear regression where y is a linear function of $x_1$ and $x_2$.


The coefficients yielding the minimum sum of the squares of the residuals are obtained
by setting the partial derivatives equal to zero and expressing the result in matrix form as

$$\begin{bmatrix} n & \sum x_{1,i} & \sum x_{2,i} \\ \sum x_{1,i} & \sum x_{1,i}^2 & \sum x_{1,i} x_{2,i} \\ \sum x_{2,i} & \sum x_{1,i} x_{2,i} & \sum x_{2,i}^2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} \sum y_i \\ \sum x_{1,i} y_i \\ \sum x_{2,i} y_i \end{Bmatrix} \tag{15.5}$$


EXAMPLE 15.2

Multiple Linear Regression


Problem Statement. The following data were created from the equation $y = 5 + 4x_1 - 3x_2$:


x_1    x_2    y
0 0 5 
2 1 10 
2.5 2 9 
1 3 0 
4 6 3 
7 2 27 


Use multiple linear regression to fit this data. 


Solution. The summations required to develop Eq. (15.5) are computed in Table 15.2. 
Substituting them into Eq. (15.5) gives 


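$$\begin{bmatrix} 6 & 16.5 & 14 \\ 16.5 & 76.25 & 48 \\ 14 & 48 & 54 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 54 \\ 243.5 \\ 100 \end{Bmatrix} \tag{15.6}$$

which can be solved for

$$a_0 = 5 \qquad a_1 = 4 \qquad a_2 = -3$$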

which is consistent with the original equation from which the data were derived. 
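
For readers who want to reproduce this result numerically, here is a brief sketch under the assumption that the six data points are stored as column vectors; the names Z, aNormal, and aLeft are illustrative. It solves the problem twice: once through the normal equations of Eq. (15.5), and once by applying left division directly to the overdetermined system.

```matlab
% Multiple linear regression for the data of Example 15.2
x1 = [0 2 2.5 1 4 7]';
x2 = [0 1 2 3 6 2]';
y  = [5 10 9 0 3 27]';
Z = [ones(size(x1)) x1 x2];   % columns: 1, x1, x2
aNormal = (Z'*Z)\(Z'*y);      % normal equations, same system as Eq. (15.6)
aLeft = Z\y;                  % least-squares solution by left division
disp([aNormal aLeft])         % both columns should be close to 5, 4, -3
```

Because the data were generated without noise, both approaches recover the generating coefficients to within roundoff.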


TABLE 15.2 Computations required to develop the normal equations for Example 15.2.


       y      x1     x2     x1^2    x2^2    x1*x2   x1*y    x2*y
       5      0      0      0       0       0       0       0
       10     2      1      4       1       2       20      10
       9      2.5    2      6.25    4       5       22.5    18
       0      1      3      1       9       3       0       0
       3      4      6      16      36      24      12      18
       27     7      2      49      4       14      189     54
Sum    54     16.5   14     76.25   54      48      243.5   100


The foregoing two-dimensional case can be easily extended to m dimensions, as in

$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m + e$$

where the standard error is formulated as

$$s_{y/x} = \sqrt{\frac{S_r}{n - (m + 1)}}$$

and the coefficient of determination is computed with Eq. (14.20).

Although there may be certain cases where a variable is linearly related to two or more
other variables, multiple linear regression has additional utility in the derivation of power
equations of the general form

$$y = a_0 x_1^{a_1} x_2^{a_2} \cdots x_m^{a_m}$$

Such equations are extremely useful when fitting experimental data. To use multiple linear
regression, the equation is transformed by taking its logarithm to yield

$$\log y = \log a_0 + a_1 \log x_1 + a_2 \log x_2 + \cdots + a_m \log x_m$$


15.3 GENERAL LINEAR LEAST SQUARES


In the preceding pages, we have introduced three types of regression: simple linear, 
polynomial, and multiple linear. In fact, all three belong to the following general linear 
least-squares model: 


$$y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e \tag{15.7}$$

where $z_0, z_1, \ldots, z_m$ are m + 1 basis functions. It can easily be seen how simple linear
and multiple linear regression fall within this model, that is, $z_0 = 1, z_1 = x_1, z_2 = x_2, \ldots,
z_m = x_m$. Further, polynomial regression is also included if the basis functions are simple
monomials as in $z_0 = 1, z_1 = x, z_2 = x^2, \ldots, z_m = x^m$.

Note that the terminology “linear” refers only to the model’s dependence on its
parameters, that is, the a’s. As in the case of polynomial regression, the functions
themselves can be highly nonlinear. For example, the z’s can be sinusoids, as in

$$y = a_0 + a_1 \cos(\omega x) + a_2 \sin(\omega x)$$

Such a format is the basis of Fourier analysis.

On the other hand, a simple-looking model such as

$$y = a_0 (1 - e^{-a_1 x})$$



is truly nonlinear because it cannot be manipulated into the format of Eq. (15.7). 
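
To make the distinction concrete, the sinusoidal model can be fit with exactly the same linear machinery as a polynomial once the frequency is fixed. The sketch below is illustrative only: the data are synthetic (generated from assumed coefficients aTrue with no noise), the angular frequency w is assumed to be known, and the solution uses the matrix form of the normal equations that is developed in the remainder of this section.

```matlab
% Synthetic, noise-free data generated from assumed coefficients
w = 2*pi/1.5;                  % assumed (known) angular frequency
x = (0:0.125:1.375)';          % 12 equally spaced values of the independent variable
aTrue = [1.7; 0.5; -0.866];    % assumed coefficients a0, a1, a2
y = aTrue(1) + aTrue(2)*cos(w*x) + aTrue(3)*sin(w*x);

% Basis functions: z0 = 1, z1 = cos(w*x), z2 = sin(w*x)
Z = [ones(size(x)) cos(w*x) sin(w*x)];

% The model is linear in a0, a1, a2, so the linear normal equations apply
a = (Z'*Z)\(Z'*y)
```

If w were also treated as an unknown, the model would no longer be linear in its parameters and this approach would not apply.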
Equation (15.7) can be expressed in matrix notation as 


$$\{y\} = [Z]\{a\} + \{e\} \tag{15.8}$$

where [Z] is a matrix of the calculated values of the basis functions at the measured values
of the independent variables:

$$[Z] = \begin{bmatrix} z_{01} & z_{11} & \cdots & z_{m1} \\ z_{02} & z_{12} & \cdots & z_{m2} \\ \vdots & \vdots & & \vdots \\ z_{0n} & z_{1n} & \cdots & z_{mn} \end{bmatrix}$$

where m is the number of variables in the model and n is the number of data points. Because
$n \ge m + 1$, you should recognize that most of the time, [Z] is not a square matrix.

The column vector {y} contains the observed values of the dependent variable:

$$\{y\}^T = \lfloor y_1 \quad y_2 \quad \cdots \quad y_n \rfloor$$

The column vector {a} contains the unknown coefficients:

$$\{a\}^T = \lfloor a_0 \quad a_1 \quad \cdots \quad a_m \rfloor$$

and the column vector {e} contains the residuals:

$$\{e\}^T = \lfloor e_1 \quad e_2 \quad \cdots \quad e_n \rfloor$$

The sum of the squares of the residuals for this model can be defined as

$$S_r = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{m} a_j z_{ji} \right)^2 \tag{15.9}$$

This quantity can be minimized by taking its partial derivative with respect to each of the
coefficients and setting the resulting equation equal to zero. The outcome of this process is
the normal equations that can be expressed concisely in matrix form as

$$[[Z]^T [Z]]\{a\} = \{[Z]^T \{y\}\} \tag{15.10}$$

It can be shown that Eq. (15.10) is, in fact, equivalent to the normal equations developed
previously for simple linear, polynomial, and multiple linear regression.

The coefficient of determination and the standard error can also be formulated in
terms of matrix algebra. Recall that $r^2$ is defined as

$$r^2 = \frac{S_t - S_r}{S_t} = 1 - \frac{S_r}{S_t}$$

Substituting the definitions of $S_r$ and $S_t$ gives

$$r^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}$$

where $\hat{y}$ = the prediction of the least-squares fit. The residuals between the best-fit curve
and the data, $y_i - \hat{y}_i$, can be expressed in vector form as

$$\{y\} - [Z]\{a\}$$

Matrix algebra can then be used to manipulate this vector to compute both the coefficient of
determination and the standard error of the estimate as illustrated in the following example.


EXAMPLE 15.3 Polynomial Regression with MATLAB


Problem Statement. Repeat Example 15.1, but use matrix operations as described in this 
section. 


Solution. First, enter the data to be fit


>> x = [0 1 2 3 4 5]';
>> y = [2.1 7.7 13.6 27.2 40.9 61.1]';

Next, create the [Z] matrix:


>> Z = [ones(size(x)) x x.^2]

Z =
     1     0     0
     1     1     1
     1     2     4
     1     3     9
     1     4    16
     1     5    25


We can verify that $[Z]^T [Z]$ results in the coefficient matrix for the normal equations:


>> Z'*Z 
ans = 
6 15 55 
15 55 225 
55 225 979 


This is the same result we obtained with summations in Example 15.1. We can solve for the 
coefficients of the least-squares quadratic by implementing Eq. (15.10): 
>> a = (Z'*Z)\(Z'*y)


a =
2.4786 
2.3593 
1.8607 


In order to compute $r^2$ and $s_{y/x}$, first compute the sum of the squares of the residuals:


>> Sr = sum((y-Z*a).^2)


Sr =
    3.7466


Then $r^2$ can be computed as


>> r2 = 1-Sr/sum((y-mean(y)).^2)


r2 =
    0.9985


and $s_{y/x}$ can be computed as


>> syx = sqrt(Sr/(length(x)-length(a)))


syx = 
1.1175 


Our primary motivation for the foregoing has been to illustrate the unity among the 
three approaches and to show how they can all be expressed simply in the same matrix 
notation. It also sets the stage for the next section where we will gain some insights into the 
preferred strategies for solving Eq. (15.10). The matrix notation will also have relevance 
when we turn to nonlinear regression in Sec. 15.5. 
15.4 QR FACTORIZATION AND THE BACKSLASH OPERATOR


Generating a best fit by solving the normal equations is widely used and certainly adequate
for many curve-fitting applications in engineering and science. It must be mentioned, however,
that the normal equations can be ill-conditioned and hence sensitive to roundoff errors.

Two more advanced methods, QR factorization and singular value decomposition, are
more robust in this regard. Although the description of these methods is beyond the scope
of this text, we mention them here because they can be implemented with MATLAB.

Further, QR factorization is automatically used in two simple ways within MATLAB.
First, for cases where you want to fit a polynomial, the built-in polyfit function
automatically uses QR factorization to obtain its results.

Second, the general linear least-squares problem can be directly solved with the
backslash operator. Recall that the general model is formulated as Eq. (15.8)

$$\{y\} = [Z]\{a\} \tag{15.11}$$

In Sec. 10.4, we used left division with the backslash operator to solve systems of linear
algebraic equations where the number of equations equals the number of unknowns (n = m). For
Eq. (15.8) as derived from general least squares, the number of equations is greater than the
number of unknowns (n > m). Such systems are said to be overdetermined. When MATLAB
senses that you want to solve such systems with left division, it automatically uses QR
factorization to obtain the solution. The following example illustrates how this is done.


EXAMPLE 15.4 Implementing Polynomial Regression with polyfit and Left Division


Problem Statement. Repeat Example 15.3, but use the built-in polyfit function and left 
division to calculate the coefficients. 


Solution. As in Example 15.3, the data can be entered and used to create the [Z] matrix 
as in 


>> x = [0 1 2 3 4 5]';
>> y = [2.1 7.7 13.6 27.2 40.9 61.1]';
>> Z = [ones(size(x)) x x.^2];


The polyfit function can be used to compute the coefficients: 


>> a = polyfit(x,y,2) 


a = 

1.8607 2.3593 2.4786 

The same result can also be calculated using the backslash: 
>> a = Z\y 
a = 

2.4786 

2.3593 

1.8607 


As just stated, both these results are obtained automatically with QR factorization. 
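
Although the details of QR factorization are beyond the scope of this text, the idea can be illustrated with MATLAB's built-in qr and cond functions. The following is a minimal sketch, not a description of what the backslash operator does internally: it compares the condition number of [Z] with that of the normal-equation matrix and then solves the least-squares problem from an explicit economy-size factorization. The variable names (condZ, condZZ, aQR, aBackslash) are chosen here for illustration.

```matlab
% Data and [Z] matrix from Example 15.3
x = [0 1 2 3 4 5]';
y = [2.1 7.7 13.6 27.2 40.9 61.1]';
Z = [ones(size(x)) x x.^2];

% Forming Z'*Z squares the 2-norm condition number of Z, which is
% why the normal equations are more sensitive to roundoff
condZ  = cond(Z)
condZZ = cond(Z'*Z)

% Economy-size QR factorization: Z = Q*R, with R upper triangular
[Q,R] = qr(Z,0);

% Least-squares coefficients from the triangular system R*a = Q'*y
aQR = R\(Q'*y)

% For comparison, the backslash solution of the overdetermined system
aBackslash = Z\y
```

For this small, well-scaled problem all approaches agree to many digits. The difference in conditioning becomes important for higher-order polynomials or poorly scaled independent variables.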
15.5 NONLINEAR REGRESSION


There are many cases in engineering and science where nonlinear models must be fit to
data. In the present context, these models are defined as those that have a nonlinear
dependence on their parameters. For example,

$$y = a_0 (1 - e^{-a_1 x}) + e \tag{15.12}$$

This equation cannot be manipulated so that it conforms to the general form of Eq. (15.7).

As with linear least squares, nonlinear regression is based on determining the values
of the parameters that minimize the sum of the squares of the residuals. However, for the
nonlinear case, the solution must proceed in an iterative fashion.

There are techniques expressly designed for nonlinear regression. For example, the
Gauss-Newton method uses a Taylor series expansion to express the original nonlinear
equation in an approximate, linear form. Then least-squares theory can be used to obtain
new estimates of the parameters that move in the direction of minimizing the residual.
Details on this approach are provided elsewhere (Chapra and Canale, 2010).

An alternative is to use optimization techniques to directly determine the least-squares
fit. For example, Eq. (15.12) can be expressed as an objective function to compute the sum
of the squares:

$$f(a_0, a_1) = \sum_{i=1}^{n} \left[ y_i - a_0 (1 - e^{-a_1 x_i}) \right]^2 \tag{15.13}$$

An optimization routine can then be used to determine the values of $a_0$ and $a_1$ that minimize
the function.

As described previously in Sec. 7.3.1, MATLAB’s fminsearch function can be used for
this purpose. It has the general syntax


[x, fval] = fminsearch(fun,x0,options,p1,p2,...)


where x = a vector of the values of the parameters that minimize the function fun, fval =
the value of the function at the minimum, x0 = a vector of the initial guesses for the
parameters, options = a structure containing values of the optimization parameters as created
with the optimset function (recall Sec. 6.5), and p1, p2, etc. = additional arguments that
are passed to the objective function. Note that if options is omitted, MATLAB uses
default values that are reasonable for most problems. If you would like to pass additional
arguments (p1, p2,...), but do not want to set the options, use empty brackets []
as a placeholder.


EXAMPLE 15.5 Nonlinear Regression with MATLAB


Problem Statement. Recall that in Example 14.6, we fit the power model to data from
Table 14.1 by linearization using logarithms. This yielded the model:

$$F = 0.2741 v^{1.9842}$$


Repeat this exercise, but use nonlinear regression. Employ initial guesses of 1 for the 
coefficients. 
Solution. First, an M-file function must be created to compute the sum of the squares. The
following file, called fSSR.m, is set up for the power equation:

function f = fSSR(a,xm,ym)
yp = a(1)*xm.^a(2);
f = sum((ym-yp).^2);

In command mode, the data can be entered as

>> x = [10 20 30 40 50 60 70 80];
>> y = [25 70 380 550 610 1220 830 1450];

The minimization of the function is then implemented by

>> fminsearch(@fSSR, [1, 1], [], x, y)

ans =
    2.5384    1.4359


The best-fit model is therefore

$$F = 2.5384 v^{1.4359}$$


Both the original transformed fit and the present version are displayed in Fig. 15.4. 
Note that although the model coefficients are very different, it is difficult to judge which fit 
is superior based on inspection of the plot. 

This example illustrates how different best-fit equations result when fitting the same 
model using nonlinear regression versus linear regression employing transformations. This 
is because the former minimizes the residuals of the original data whereas the latter mini- 
mizes the residuals of the transformed data. 
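
The same fit can be set up without a separate M-file or trailing arguments by using anonymous functions, which capture the data from the workspace when they are created. The following is a minimal sketch under that approach; the names model and SSR are chosen here for illustration, and the default fminsearch options are assumed.

```matlab
% Force versus velocity data from Table 14.1
x = [10 20 30 40 50 60 70 80];
y = [25 70 380 550 610 1220 830 1450];

% Power model and its sum of the squares of the residuals;
% x and y are captured when SSR is defined
model = @(a,x) a(1)*x.^a(2);
SSR   = @(a) sum((y - model(a,x)).^2);

% Minimize starting from initial guesses of 1 for both coefficients
[a, Sr] = fminsearch(SSR, [1, 1])
```

Because the objective function and the starting guesses are the same as those used above, the coefficients should agree with the values obtained with fSSR.m to within the optimizer's default tolerances.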


FIGURE 15.4 
Comparison of transformed and untransformed model fits for force versus velocity data from 
Table 14.1. 
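
For readers curious about the Gauss-Newton method mentioned at the start of this section, the following is a bare-bones sketch applied to the model of Eq. (15.12). It is an illustration under stated assumptions, not a robust implementation: the data are synthetic and noise-free (generated from assumed parameters aTrue), the initial guesses are reasonably close to the solution, and no step-size control or safeguards are included.

```matlab
% Synthetic, noise-free data generated from assumed parameters
x = (0.25:0.5:2.25)';
aTrue = [0.8; 1.7];                       % assumed "true" a0 and a1
y = aTrue(1)*(1 - exp(-aTrue(2)*x));

a = [1; 1];                               % initial guesses
for iter = 1:50
    f = a(1)*(1 - exp(-a(2)*x));          % model predictions
    % Partial derivatives of the model with respect to a0 and a1
    J = [1 - exp(-a(2)*x), a(1)*x.*exp(-a(2)*x)];
    % Linear least-squares solve for the parameter correction
    da = J\(y - f);
    a = a + da;
    if norm(da) < 1e-10*norm(a), break, end
end
a
```

Each pass linearizes the model about the current parameter estimates and solves a linear least-squares problem for the correction, which is the sense in which the method reuses least-squares theory.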
15.6 CASE STUDY FITTING EXPERIMENTAL DATA


Background. As mentioned at the end of Sec. 15.2, although there are many cases where 
a variable is linearly related to two or more other variables, multiple linear regression has 
additional utility in the derivation of multivariable power equations of the general form 


$$y = a_0 x_1^{a_1} x_2^{a_2} \cdots x_m^{a_m} \tag{15.14}$$

Such equations are extremely useful when fitting experimental data. To do this, the equa- 
tion is transformed by taking its logarithm to yield 


$$\log y = \log a_0 + a_1 \log x_1 + a_2 \log x_2 + \cdots + a_m \log x_m \tag{15.15}$$


Thus, the logarithm of the dependent variable is linearly dependent on the logarithms of 
the independent variables. 

A simple example relates to gas transfer in natural waters such as rivers, lakes,
and estuaries. In particular, it has been found that the mass-transfer coefficient of
dissolved oxygen $K_L$ (m/d) is related to a river’s mean water velocity U (m/s) and depth
H (m) by

$$K_L = a_0 U^{a_1} H^{a_2} \tag{15.16}$$

Taking the common logarithm yields

$$\log K_L = \log a_0 + a_1 \log U + a_2 \log H \tag{15.17}$$

The following data were collected in a laboratory flume at a constant temperature
of 20 °C:

U      0.5     2      10     0.5     2      10     0.5     2      10
H      0.15    0.15   0.15   0.3     0.3    0.3    0.5     0.5    0.5
K_L    0.48    3.9    57     0.85    5      77     0.8     9      92

Use these data and general linear least squares to evaluate the constants in Eq. (15.16). 


Solution. In a similar fashion to Example 15.3, we can develop a script to assign the data,
create the [Z] matrix, and compute the coefficients for the least-squares fit:


% Compute best fit of transformed values
clc; format short g
U=[0.5 2 10 0.5 2 10 0.5 2 10]';
H=[0.15 0.15 0.15 0.3 0.3 0.3 0.5 0.5 0.5]';
KL=[0.48 3.9 57 0.85 5 77 0.8 9 92]';
logU=log10(U);logH=log10(H);logKL=log10(KL);
Z=[ones(size(logKL)) logU logH];
a=(Z'*Z)\(Z'*logKL)


with the result: 


a = 
0.57627 
1.562 
0.50742 


Therefore, the best-fit model is 


$$\log K_L = 0.57627 + 1.562 \log U + 0.50742 \log H$$

or in the untransformed form (note, $a_0 = 10^{0.57627} = 3.7694$),

$$K_L = 3.7694\, U^{1.5620} H^{0.5074}$$
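
As a quick check on the back-transformation, the fitted power model can be wrapped in a function and evaluated at the measured conditions. The following is a minimal sketch, assuming the flume data of this case study; it uses left division rather than the normal equations, and the names KLmodel and relErr are chosen here for illustration.

```matlab
% Flume data for the case study
U  = [0.5 2 10 0.5 2 10 0.5 2 10]';
H  = [0.15 0.15 0.15 0.3 0.3 0.3 0.5 0.5 0.5]';
KL = [0.48 3.9 57 0.85 5 77 0.8 9 92]';

% Fit the log-transformed model with left division
Z = [ones(size(U)) log10(U) log10(H)];
a = Z\log10(KL);

% Back-transform the intercept and build the power model
a0 = 10^a(1);
KLmodel = @(U,H) a0*U.^a(2).*H.^a(3);

% Relative error of the predictions at the measured conditions
relErr = (KLmodel(U,H) - KL)./KL
```

Note that only the intercept needs to be back-transformed; the exponents $a_1$ and $a_2$ carry over unchanged from the transformed fit.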


The statistics can also be determined by adding the following lines to the script: 


% Compute fit statistics
Sr=sum((logKL-Z*a).^2)
r2=1-Sr/sum((logKL-mean(logKL)).^2)
syx=sqrt(Sr/(length(logKL)-length(a)))


Sr = 
0.024171 
r2 = 
0.99619 
syx = 
0.063471 


Finally, plots of the fit can be developed. The following statements display the model 
predictions versus the measured values for K,. Subplots are employed to do this for both 
the transformed and untransformed versions. 


% Generate plots
% Measured values are plotted on the x axis and predictions on the y axis
clf
KLpred=10^a(1)*U.^a(2).*H.^a(3);
KLmin=min(KL);KLmax=max(KL);
dKL=(KLmax-KLmin)/100;
KLmod=[KLmin:dKL:KLmax];
subplot(1,2,1)
loglog(KL,KLpred,'ko',KLmod,KLmod,'k-')
axis square,title('(a) log-log plot')
legend('model prediction','1:1 line','Location','NorthWest')
xlabel('log(K_L) measured'),ylabel('log(K_L) predicted')
subplot(1,2,2)
plot(KL,KLpred,'ko',KLmod,KLmod,'k-')
axis square,title('(b) untransformed plot')
legend('model prediction','1:1 line','Location','NorthWest')
xlabel('K_L measured'),ylabel('K_L predicted')


The result is shown in Fig. 15.5. 


FIGURE 15.5
Plots of predicted versus measured values of the oxygen mass-transfer coefficient as
computed with multiple regression. Results are shown for (a) log transformed and (b)
untransformed cases. The 1:1 line, which indicates a perfect correlation, is superimposed
on both plots.


PROBLEMS 


15.1 Fit a parabola to the data from Table 14.1. Determine
the $r^2$ for the fit and comment on the efficacy of the result.
15.2 Using the same approach as was employed to derive 
Eqs. (14.15) and (14.16), derive the least-squares fit of the 
following model: 


$$y = a_1 x + a_2 x^2 + e$$

That is, determine the coefficients that result in the least- 
squares fit for a second-order polynomial with a zero in- 
tercept. Test the approach by using it to fit the data from 
Table 14.1. 
15.3 Fit a cubic polynomial to the following data: 


[Data table not legible in the source.]


Along with the coefficients, determine $r^2$ and $s_{y/x}$.


15.4 Develop an M-file to implement polynomial re- 
gression. Pass the M-file two vectors holding the x and y 
values along with the desired order m. Test it by solving 
Prob. 15.3. 

15.5 For the data from Table P15.5, use polynomial 
regression to derive a predictive equation for dissolved oxy- 
gen concentration as a function of temperature for the case 
where the chloride concentration is equal to zero. Employ a 
polynomial that is of sufficiently high order that the predic- 
tions match the number of significant digits displayed in 
the table. 

15.6 Use multiple linear regression to derive a predictive 
equation for dissolved oxygen concentration as a func- 
tion of temperature and chloride based on the data from 
Table P15.5. Use the equation to estimate the concentration 
of dissolved oxygen for a chloride concentration of 15 g/L at 
T = 12 °C. Note that the true value is 9.09 mg/L. Compute 
TABLE P15.5 Dissolved oxygen concentration in water as a function of temperature (°C)
and chloride concentration (g/L).


Dissolved Oxygen (mg/L) for Temperature (°C) and Concentration of Chloride (g/L)


T, °C c = 0 g/L c = 10 g/L c = 20 g/L
0 14.6 12.9 11.4 
5 12.8 1123 10.3 
10 11.3 10.1 8.96 
15 10.1 9.03 8.08 
20 9.09 8.17 7.35 
25 8.26 7.46 6.73 
30 7.56 6.85 6.20 


the percent relative error for your prediction. Explain pos- 
sible causes for the discrepancy. 

15.7 As compared with the models from Probs. 15.5 and 
15.6, a somewhat more sophisticated model that accounts 
for the effect of both temperature and chloride on dis- 
solved oxygen saturation can be hypothesized as being 
of the form 


$$o = f_3(T) + f_1(c)$$


That is, a third-order polynomial in temperature and a lin- 
ear relationship in chloride is assumed to yield superior re- 
sults. Use the general linear least-squares approach to fit this 
model to the data in Table P15.5. Use the resulting equation 
to estimate the dissolved oxygen concentration for a chloride 
concentration of 15 g/L at T = 12 °C. Note that the true 
value is 9.09 mg/L. Compute the percent relative error for 
your prediction. 

15.8 Use multiple linear regression to fit 


x1    0      1      1      2      2      3      3      4      4
x2    0      1      2      1      2      1      2      1      2
y     15.1   17.9   12.7   25.6   20.5   35.1   29.7   45.4   40.2


Compute the coefficients, the standard error of the estimate, 
and the correlation coefficient. 

15.9 The following data were collected for the steady flow 
of water in a concrete circular pipe: 


Experiment    Diameter, m    Slope, m/m    Flow, m^3/s
1             0.3            0.001         0.04
2             0.6            0.001         0.24
3             0.9            0.001         0.69
4             0.3            0.01          0.13
5             0.6            0.01          0.82
6             0.9            0.01          2.38
7             0.3            0.05          0.31
8             0.6            0.05          1.95
9             0.9            0.05          5.66


Use multiple linear regression to fit the following model to
this data:

$$Q = a_0 D^{a_1} S^{a_2}$$


where Q = flow, D = diameter, and S = slope. 
15.10 Three disease-carrying organisms decay exponen- 
tially in seawater according to the following model: 


p(t) = Ae! 4 Be ®t ot Cet 


Use general linear least-squares to estimate the initial con- 
centration of each organism (A, B, and C) given the follow- 
ing measurements: 


t       0.5    1      2      3      4    5      6      7      9
p(t)    6      4.4    3.2    2.7    2    1.9    1.7    1.4    1.1


15.11 The following model is used to represent the effect of
solar radiation on the photosynthesis rate of aquatic plants:

$$P = P_m \frac{I}{I_{sat}} e^{-\frac{I}{I_{sat}} + 1}$$

where P = the photosynthesis rate (mg m^-3 d^-1), $P_m$ = the maximum photosynthesis
rate (mg m^-3 d^-1), I = solar radiation (μE m^-2 s^-1), and $I_{sat}$ = optimal solar
radiation (μE m^-2 s^-1). Use nonlinear regression to evaluate $P_m$ and $I_{sat}$
based on the following data:


I    50    80     130    200    250    350    450    550    700
P    99    177    202    248    229    219    173    142    72
15.12 The following data are provided 

x 1 2 3 4 5 
y 2.2 2.8 3.6 4.5 5.5 
Fit the following model to this data using MATLAB and the
general linear least-squares model 


y=a+bx+£ 


15.13 In Prob. 14.8 we used transformations to linearize 
and fit the following model: 


$$y = \alpha_4 x e^{\beta_4 x}$$


Use nonlinear regression to estimate $\alpha_4$ and $\beta_4$ based on the
following data. Develop a plot of your fit along with the data.


15.14 Enzymatic reactions are used extensively to charac- 
terize biologically mediated reactions. The following is an 
example of a model that is used to fit such reactions: 


$$v_0 = \frac{k_m [S]^3}{K + [S]^3}$$

where $v_0$ = the initial rate of the reaction (M/s), [S] = the
substrate concentration (M), and $k_m$ and K are parameters.
The following data can be fit with this model:


[S], M    v_0, M/s
0.01      6.078 × 10^-11
0.05      7.595 × 10^-9
0.1       6.063 × 10^-8
0.5       5.788 × 10^-6
1         1.737 × 10^-5
5         2.423 × 10^-5
10        2.430 × 10^-5
50        2.431 × 10^-5
100       2.431 × 10^-5


(a) Use a transformation to linearize the model and evaluate 
the parameters. Display the data and the model fit on a 
graph. 

(b) Perform the same evaluation as in (a) but use nonlinear 
regression. 

15.15 Given the data 


x    5     10    15    20    25    30    35    40    45    50
y    17    24    31    33    37    37    40    40    42    41


use least-squares regression to fit (a) a straight line, (b) a 
power equation, (c) a saturation-growth-rate equation, and 
(d) a parabola. For (b) and (c), employ transformations to 
linearize the data. Plot the data along with all the curves. Is 
any one of the curves superior? If so, justify. 

15.16 The following data represent the bacterial growth in a
liquid culture over a number of days:


Day 0 4 8 12 16 20 
Amount x 
10° 67.38 74.67 82.74 91.69 101.60 112.58 


Find a best-fit equation to the data trend. Try several 
possibilities—linear, quadratic, and exponential. Determine 
the best equation to predict the amount of bacteria after 
35 days. 

15.17 Dynamic viscosity of water μ (10^-3 N·s/m^2) is related
to temperature T (°C) in the following manner:


T    0        5        10       20       30        40
μ    1.787    1.519    1.307    1.002    0.7975    0.6529


(a) Plot this data.

(b) Use linear interpolation to predict μ at T = 7.5 °C.

(c) Use polynomial regression to fit a parabola to the data in 
order to make the same prediction. 

15.18 Use general linear least squares to find the best possible
virial constants ($A_1$ and $A_2$) for the following equation
of state. R = 82.05 mL atm/gmol K, and T = 303 K.

$$\frac{PV}{RT} = 1 + \frac{A_1}{V} + \frac{A_2}{V^2}$$

P (atm)    0.985     1.108     1.363     1.631
V (mL)     25,000    22,200    18,000    15,000


15.19 Environmental scientists and engineers dealing with
the impacts of acid rain must determine the value of the
ion product of water $K_w$ as a function of temperature.
Scientists have suggested the following equation to model this
relationship:

$$-\log_{10} K_w = \frac{a}{T_a} + b \log_{10} T_a + c T_a + d$$

where $T_a$ = absolute temperature (K), and a, b, c, and d are
parameters. Employ the following data and regression to
estimate the parameters with MATLAB. Also, generate a
plot of predicted $K_w$ versus the data.


T (°C)    K_w
0         1.164 × 10^-15
10        2.950 × 10^-15
20        6.846 × 10^-15
30        1.467 × 10^-14
40        2.929 × 10^-14


15.20 The distance required to stop an automobile consists 
of both thinking and braking components, each of which 
is a function of its speed. The following experimental data 
were collected to quantify this relationship. Develop best- 
fit equations for both the thinking and braking components. 
Use these equations to estimate the total stopping distance 
for a car traveling at 110 km/hr. 


Speed, km/hr 30 45 60 75 90 120
Thinking, m 5.6 8.5 11.1 14.5 16.7 22.4 
Braking, m 5.0 12.3 21.0 32.9 47.6 84.7 


15.21 An investigator has reported the data tabulated below. 
It is known that such data can be modeled by the following 
equation 


x = e04 


where a and b are parameters. Use nonlinear regression to 
determine a and b. Based on your analysis predict y at x = 2.6. 


[Data table not legible in the source.]

15.22 It is known that the data tabulated below can be modeled
by the following equation


2 
a] 


Use nonlinear regression to determine the parameters a and 
b. Based on your analysis predict y at x = 1.6. 


[Data table not legible in the source.]


15.23 An investigator has reported the data tabulated below 
for an experiment to determine the growth rate of bacteria 
k (per d), as a function of oxygen concentration c (mg/L). 
It is known that such data can be modeled by the following 


Use nonlinear regression to estimate $c_s$ and $k_{max}$ and predict
the growth rate at c = 2 mg/L.


[Data table not legible in the source.]


15.24 A material is tested for cyclic fatigue failure whereby 
a stress, in MPa, is applied to the material and the number of 
cycles needed to cause failure is measured. The results are 
in the table below. Use nonlinear regression to fit a power 
model to this data. 


N, cycles      1       10      100    1000    10,000    100,000    1,000,000
Stress, MPa    1100    1000    925    800     625       550        420


15.25 The following data shows the relationship between 
the viscosity of SAE 70 oil and temperature. Use nonlinear 
regression to fit a power equation to this data. 


Temperature, T, °C        26.67    93.33    148.89    315.56
Viscosity, μ, N·s/m^2     1.35     0.085    0.012     0.00075


15.26 The concentration of E. coli bacteria in a swimming 
area is monitored after a storm: 


t (hr) 4 8 12 16 20 24 
c(CFU/100 mL) 1590 1320 1000 900 650 560 


The time is measured in hours following the end of the storm 
and the unit CFU is a “colony forming unit.” Employ non- 
linear regression to fit an exponential model [Eq. (14.22)] 
to this data. Use the model to estimate (a) the concentration 
at the end of the storm (t = 0) and (b) the time at which the 
concentration will reach 200 CFU/100 mL. 
15.27 Employ nonlinear regression and the following set
of pressure-volume data to find the best possible virial
constants ($A_1$ and $A_2$) for the equation of state shown below.
R = 82.05 mL atm/gmol K and T = 303 K.

$$\frac{PV}{RT} = 1 + \frac{A_1}{V} + \frac{A_2}{V^2}$$

P (atm)    0.985     1.108     1.363     1.631
V (mL)     25,000    22,200    18,000    15,000


15.28 Three disease-carrying organisms decay exponentially 
in lake water according to the following model: 


Pt) = Ae! + Be?! + Ce 


Use nonlinear regression to estimate the initial popula- 
tion of each organism (A, B, and C) given the following 
measurements: 


t, hr    0.5    1      2      3      4      5      6      7      9
p(t)     6.0    4.4    3.2    2.7    2.2    1.9    1.7    1.4    1.1


15.29 The Antoine equation describes the relation between 
vapor pressure and temperature for pure components as 
$$\ln(p) = A - \frac{B}{C + T}$$

where p is the vapor pressure, T is temperature (K), and A, B,
and C are component-specific constants. Use MATLAB to
determine the best values for the constants for carbon
monoxide based on the following measurements


T (K)     50    60      70        80        90          100       110         120         130
p (Pa)    82    2300    18,500    80,500    2.3×10^5    5×10^5    9.6×10^5    1.5×10^6    2.4×10^6


In addition to the constants determine the $r^2$ and $s_{y/x}$ for your fit.
15.30 The following model, based on a simplification of the 
Arrhenius equation, is frequently used in environmental en- 
gineering to parameterize the effect of temperature, T(°C), 
on pollutant decay rates, k (per day), 


$$k = k_{20}\,\theta^{T-20}$$

where the parameters $k_{20}$ = the decay rate at 20 °C, and θ =
the dimensionless temperature dependence coefficient. The
following data are collected in the laboratory


T (°C) 6 12 18 24 30 
k (per d) 0.15 0.20 0.32 0.45 0.70 


(a) Use a transformation to linearize this equation and then
employ linear regression to estimate $k_{20}$ and θ. (b) Employ
nonlinear regression to estimate the same parameters. For
both (a) and (b) employ the equation to predict the reaction
rate at T = 17 °C.


Fourier Analysis 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to Fourier analysis. The 
subject, which is named after Joseph Fourier, involves identifying cycles or patterns 
within a time series of data. Specific objectives and topics covered in this chapter are 


Understanding sinusoids and how they can be used for curve fitting.

Knowing how to use least-squares regression to fit a sinusoid to data.

Knowing how to fit a Fourier series to a periodic function.

Understanding the relationship between sinusoids and complex exponentials
based on Euler’s formula.

Recognizing the benefits of analyzing mathematical functions or signals in the
frequency domain (i.e., as a function of frequency).

Understanding how the Fourier integral and transform extend Fourier analysis to
aperiodic functions.

Understanding how the discrete Fourier transform (DFT) extends Fourier analysis
to discrete signals.

Recognizing how discrete sampling affects the ability of the DFT to distinguish
frequencies. In particular, know how to compute and interpret the Nyquist frequency.

Recognizing how the fast Fourier transform (FFT) provides a highly efficient
means to compute the DFT for cases where the data record length is a power of 2.

Knowing how to use the MATLAB function fft to compute a DFT and understand
how to interpret the results.

Knowing how to compute and interpret a power spectrum.


YOU’VE GOT A PROBLEM 


At the beginning of Chap. 8, we used Newton’s second law and force balances to
predict the equilibrium positions of three bungee jumpers connected by cords.
Then, in Chap. 13, we determined the same system’s eigenvalues and eigenvectors
in order to identify its resonant frequencies and principal modes of vibration. Although this
analysis certainly provided useful results, it required detailed system information
including knowledge of the underlying model and parameters (i.e., the jumpers’ masses and the
cords’ spring constants).

So suppose that you have measurements of the jumpers’ positions or velocities at
discrete, equally spaced times (recall Fig. 13.1). Such information is referred to as a time
series. However, suppose further that you do not know the underlying model or the
parameters needed to compute the eigenvalues. For such cases, is there any way to use the time
series to learn something fundamental about the system’s dynamics?

In this chapter, we describe such an approach, Fourier analysis, which provides a way
to accomplish this objective. The approach is based on the premise that more complicated
functions (e.g., a time series) can be represented by the sum of simpler trigonometric
functions. As a prelude to outlining how this is done, it is useful to explore how data can be fit
with sinusoidal functions.


16.1 CURVE FITTING WITH SINUSOIDAL FUNCTIONS


A periodic function f(t) is one for which 


fO=ft+T) (16.1) 


where T is a constant called the period that is the smallest value of time for which Eq. (16.1) 
holds. Common examples include both artificial and natural signals (Fig. 16.1q). 

The most fundamental are sinusoidal functions. In this discussion, we will use the term 
sinusoid to represent any waveform that can be described as a sine or cosine. There is no 


FIGURE 16.1 

Aside from trigonometric functions such as sines and cosines, periodic functions include ide- 
alized waveforms like the square wave depicted in (a). Beyond such artificial forms, periodic 
signals in nature can be contaminated by noise like the air temperatures shown in (b). 


FIGURE 16.2

(a) A plot of the sinusoidal function $y(t) = A_0 + C_1\cos(\omega_0 t + \theta)$. For this case, $A_0 = 1.7$,
$C_1 = 1$, $\omega_0 = 2\pi/T = 2\pi/(1.5\ \text{s})$, and $\theta = \pi/3$ radians $= 1.0472$ ($= 0.25$ s). Other parameters
used to describe the curve are the frequency $f = \omega_0/(2\pi)$, which for this case is
1 cycle/(1.5 s) = 0.6667 Hz, and the period $T = 1.5$ s. (b) An alternative expression of the
same curve is $y(t) = A_0 + A_1\cos(\omega_0 t) + B_1\sin(\omega_0 t)$. The three components of this function
are depicted in (b), where $A_1 = 0.5$ and $B_1 = -0.866$. The summation of the three curves in
(b) yields the single curve in (a).


clear-cut convention for choosing either function, and in any case, the results will be identical
because the two functions are simply offset in time by $\pi/2$ radians. For this chapter, we
will use the cosine, which can be expressed generally as


$$ f(t) = A_0 + C_1\cos(\omega_0 t + \theta) \tag{16.2} $$


Inspection of Eq. (16.2) indicates that four parameters serve to uniquely characterize the
sinusoid (Fig. 16.2a):


The mean value $A_0$ sets the average height above the abscissa.

The amplitude $C_1$ specifies the height of the oscillation.

The angular frequency $\omega_0$ characterizes how often the cycles occur.

The phase angle (or phase shift) $\theta$ parameterizes the extent to which the sinusoid is
shifted horizontally.
FIGURE 16.3

Graphical depictions of (a) a lagging phase angle and (b) a leading phase angle. Note that
the lagging curve in (a) can be alternatively described as $\cos(\omega_0 t + 3\pi/2)$. In other words, if a
curve lags by an angle of $\alpha$, it can also be represented as leading by $2\pi - \alpha$.


Note that the angular frequency (in radians/time) is related to the ordinary frequency
$f$ (in cycles/time)¹ by


$$ \omega_0 = 2\pi f \tag{16.3} $$


and the ordinary frequency in turn is related to the period $T$ by

$$ f = \frac{1}{T} \tag{16.4} $$


In addition, the phase angle represents the distance in radians from $t = 0$ to the point
at which the cosine function begins a new cycle. As depicted in Fig. 16.3a, a negative
value is referred to as a lagging phase angle because the curve $\cos(\omega_0 t - \theta)$ begins a new
cycle $\theta$ radians after $\cos(\omega_0 t)$. Thus, $\cos(\omega_0 t - \theta)$ is said to lag $\cos(\omega_0 t)$. Conversely, as in
Fig. 16.3b, a positive value is referred to as a leading phase angle.

Although Eq. (16.2) is an adequate mathematical characterization of a sinusoid, it 
is awkward to work with from the standpoint of curve fitting because the phase shift is 
included in the argument of the cosine function. This deficiency can be overcome by 
invoking the trigonometric identity: 


$$ C_1\cos(\omega_0 t + \theta) = C_1[\cos(\omega_0 t)\cos(\theta) - \sin(\omega_0 t)\sin(\theta)] \tag{16.5} $$


¹ When the time unit is seconds, the unit for the ordinary frequency is a cycle/s or Hertz (Hz).
Substituting Eq. (16.5) into Eq. (16.2) and collecting terms gives (Fig. 16.2b)


$$ f(t) = A_0 + A_1\cos(\omega_0 t) + B_1\sin(\omega_0 t) \tag{16.6} $$

where

$$ A_1 = C_1\cos(\theta) \qquad B_1 = -C_1\sin(\theta) \tag{16.7} $$


Dividing the two parts of Eq. (16.7) gives


$$ \theta = \arctan\left(-\frac{B_1}{A_1}\right) \tag{16.8} $$


where, if $A_1 < 0$, add $\pi$ to $\theta$. Squaring and summing Eq. (16.7) leads to

$$ C_1 = \sqrt{A_1^2 + B_1^2} \tag{16.9} $$

Thus, Eq. (16.6) represents an alternative formulation of Eq. (16.2) that still requires four 
parameters but that is cast in the format of a general linear model [recall Eq. (15.7)]. As we 
will discuss in the next section, it can be simply applied as the basis for a least-squares fit. 
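
The conversions in Eqs. (16.7) through (16.9) can be checked numerically. The following is a minimal sketch, assuming the parameter values of Fig. 16.2; the variable names (thetaBack, C1Back) are illustrative and not part of the text. It uses atan2 in place of the "add π if A1 < 0" rule, which is a substitution made here because atan2 selects the quadrant automatically.

```matlab
% Convert between the two forms of a sinusoid, Eqs. (16.7) through (16.9).
% The numerical values are those of Fig. 16.2; variable names are illustrative.
C1 = 1; theta = pi/3;            % amplitude and phase angle of Eq. (16.2)

% Eq. (16.7): amplitude/phase form -> cosine/sine coefficients
A1 = C1*cos(theta);
B1 = -C1*sin(theta);

% Eqs. (16.8) and (16.9): recover the phase angle and amplitude.
% atan2(-B1,A1) returns an angle in (-pi, pi] in the correct quadrant,
% so no separate correction is needed when A1 < 0.
thetaBack = atan2(-B1, A1);
C1Back = sqrt(A1^2 + B1^2);

fprintf('A1 = %.4f, B1 = %.4f\n', A1, B1);
fprintf('theta = %.4f rad, C1 = %.4f\n', thetaBack, C1Back);
```

Note that atan2 and the rule attached to Eq. (16.8) can differ by a multiple of 2π for some inputs, which leaves the sinusoid unchanged.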

Before proceeding to the next section, however, we should stress that we could have 
employed a sine rather than a cosine as our fundamental model of Eq. (16.2). For example, 


$$ f(t) = A_0 + C_1\sin(\omega_0 t + \delta) $$


could have been used. Simple relationships can be applied to convert between the two forms:


$$ \sin(\omega_0 t + \delta) = \cos\left(\omega_0 t + \delta - \frac{\pi}{2}\right) $$


and


$$ \cos(\omega_0 t + \delta) = \sin\left(\omega_0 t + \delta + \frac{\pi}{2}\right) \tag{16.10} $$


In other words, $\theta = \delta - \pi/2$. The only important consideration is that one or the other format
should be used consistently. Thus, we will use the cosine version throughout our discussion.


16.1.1 Least-Squares Fit of a Sinusoid 
Equation (16.6) can be thought of as a linear least-squares model:

$$ y = A_0 + A_1\cos(\omega_0 t) + B_1\sin(\omega_0 t) + e \tag{16.11} $$

which is just another example of the general model [recall Eq. (15.7)]

$$ y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e $$


where $z_0 = 1$, $z_1 = \cos(\omega_0 t)$, $z_2 = \sin(\omega_0 t)$, and all other $z$'s $= 0$. Thus, our goal is to
determine coefficient values that minimize


$$ S_r = \sum_{i=1}^{N} \left\{ y_i - [A_0 + A_1\cos(\omega_0 t_i) + B_1\sin(\omega_0 t_i)] \right\}^2 $$
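The normal equations to accomplish this minimization can be expressed in matrix form as
[recall Eq. (15.10)]


$$
\begin{bmatrix}
N & \sum\cos(\omega_0 t) & \sum\sin(\omega_0 t) \\
\sum\cos(\omega_0 t) & \sum\cos^2(\omega_0 t) & \sum\cos(\omega_0 t)\sin(\omega_0 t) \\
\sum\sin(\omega_0 t) & \sum\cos(\omega_0 t)\sin(\omega_0 t) & \sum\sin^2(\omega_0 t)
\end{bmatrix}
\begin{Bmatrix} A_0 \\ A_1 \\ B_1 \end{Bmatrix}
=
\begin{Bmatrix} \sum y \\ \sum y\cos(\omega_0 t) \\ \sum y\sin(\omega_0 t) \end{Bmatrix}
\tag{16.12}
$$


These equations can be employed to solve for the unknown coefficients. However,
rather than do this, we can examine the special case where there are N observations
equispaced at intervals of $\Delta t$ and with a total record length of $T = (N - 1)\Delta t$. For this situation,
the following average values can be determined (see Prob. 16.5):


$$
\begin{aligned}
\frac{\sum\sin(\omega_0 t)}{N} &= 0 & \frac{\sum\cos(\omega_0 t)}{N} &= 0 \\
\frac{\sum\sin^2(\omega_0 t)}{N} &= \frac{1}{2} & \frac{\sum\cos^2(\omega_0 t)}{N} &= \frac{1}{2} \\
\frac{\sum\cos(\omega_0 t)\sin(\omega_0 t)}{N} &= 0
\end{aligned}
\tag{16.13}
$$

Thus, for equispaced points the normal equations become

$$
\begin{bmatrix} N & 0 & 0 \\ 0 & N/2 & 0 \\ 0 & 0 & N/2 \end{bmatrix}
\begin{Bmatrix} A_0 \\ A_1 \\ B_1 \end{Bmatrix}
=
\begin{Bmatrix} \sum y \\ \sum y\cos(\omega_0 t) \\ \sum y\sin(\omega_0 t) \end{Bmatrix}
$$


The inverse of a diagonal matrix is merely another diagonal matrix whose elements are the
reciprocals of the original. Thus, the coefficients can be determined as


$$
\begin{Bmatrix} A_0 \\ A_1 \\ B_1 \end{Bmatrix}
=
\begin{bmatrix} 1/N & 0 & 0 \\ 0 & 2/N & 0 \\ 0 & 0 & 2/N \end{bmatrix}
\begin{Bmatrix} \sum y \\ \sum y\cos(\omega_0 t) \\ \sum y\sin(\omega_0 t) \end{Bmatrix}
$$

or

$$ A_0 = \frac{\sum y}{N} \tag{16.14} $$

$$ A_1 = \frac{2}{N}\sum y\cos(\omega_0 t) \tag{16.15} $$

$$ B_1 = \frac{2}{N}\sum y\sin(\omega_0 t) \tag{16.16} $$


Notice that the first coefficient represents the function's average value.


EXAMPLE 16.1 Least-Squares Fit of a Sinusoid


Problem Statement. The curve in Fig. 16.2a is described by $y = 1.7 + \cos(4.189t + 1.0472)$.
Generate 10 discrete values for this curve at intervals of $\Delta t = 0.15$ for the range
$t = 0$ to 1.35. Use this information to evaluate the coefficients of Eq. (16.11) by a
least-squares fit.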
Solution. The data required to evaluate the coefficients with $\omega_0 = 4.189$ are


t        y         y cos(ω0 t)    y sin(ω0 t)

0        2.200      2.200          0.000
0.15     1.595      1.291          0.938
0.30     1.031      0.319          0.980
0.45     0.722     -0.223          0.687
0.60     0.786     -0.636          0.462
0.75     1.200     -1.200          0.000
0.90     1.805     -1.460         -1.061
1.05     2.369     -0.732         -2.253
1.20     2.678      0.829         -2.547
1.35     2.614      2.114         -1.536
Σ =      17.000     2.502         -4.330


These results can be used to determine [Eqs. (16.14) through (16.16)]


$$ A_0 = \frac{17.000}{10} = 1.7 \qquad A_1 = \frac{2}{10}(2.502) = 0.500 \qquad B_1 = \frac{2}{10}(-4.330) = -0.866 $$


Thus, the least-squares fit is

$$ y = 1.7 + 0.500\cos(\omega_0 t) - 0.866\sin(\omega_0 t) $$

The model can also be expressed in the format of Eq. (16.2) by calculating [Eq. (16.8)]

$$ \theta = \arctan\left(-\frac{-0.866}{0.500}\right) = 1.0472 $$


and [Eq. (16.9)]


$$ C_1 = \sqrt{0.5^2 + (-0.866)^2} = 1.00 $$

to give

$$ y = 1.7 + \cos(\omega_0 t + 1.0472) $$

or alternatively, as a sine by using [Eq. (16.10)]

$$ y = 1.7 + \sin(\omega_0 t + 2.618) $$
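
The hand calculation of Example 16.1 can be repeated in a few lines of MATLAB. The script below is a sketch rather than part of the original example: it evaluates Eqs. (16.14) through (16.16) directly and, as a cross-check that is an addition here, also solves the general linear least-squares problem with the backslash operator. The names Z and a are illustrative.

```matlab
% Least-squares fit of a sinusoid with Eqs. (16.14) through (16.16),
% using the data of Example 16.1. Variable names are illustrative.
w0 = 4.189;                          % angular frequency, rad/s
t = (0:9)*0.15;                      % 10 equispaced sample times, 0 to 1.35
y = 1.7 + cos(w0*t + 1.0472);        % samples of the known curve
N = length(t);

A0 = sum(y)/N;                       % Eq. (16.14): mean value
A1 = 2/N*sum(y.*cos(w0*t));          % Eq. (16.15)
B1 = 2/N*sum(y.*sin(w0*t));          % Eq. (16.16)

% Cross-check: general linear least squares with basis [1 cos sin]
Z = [ones(N,1) cos(w0*t') sin(w0*t')];
a = Z\y';

fprintf('Closed form:  A0 = %.3f, A1 = %.3f, B1 = %.3f\n', A0, A1, B1);
fprintf('Backslash:    A0 = %.3f, A1 = %.3f, B1 = %.3f\n', a(1), a(2), a(3));
```

The two sets of coefficients agree to about three decimal places. The small difference arises because 4.189 is a rounded value of 2π/1.5, so the averages in Eq. (16.13) hold only approximately for these samples.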


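The foregoing analysis can be extended to the general model


$$ f(t) = A_0 + A_1\cos(\omega_0 t) + B_1\sin(\omega_0 t) + A_2\cos(2\omega_0 t) + B_2\sin(2\omega_0 t) + \cdots + A_m\cos(m\omega_0 t) + B_m\sin(m\omega_0 t) $$


where, for equally spaced data, the coefficients can be evaluated by


$$ A_0 = \frac{\sum y}{N} $$

$$ A_j = \frac{2}{N}\sum y\cos(j\omega_0 t) $$

$$ B_j = \frac{2}{N}\sum y\sin(j\omega_0 t) $$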
Although these relationships can be used to fit data in the regression sense (i.e., N >
2m + 1), an alternative application is to employ them for interpolation or collocation, that
is, to use them for the case where the number of unknowns 2m + 1 is equal to the number of
data points N. This is the approach used in the continuous Fourier series, as described next.


16.2 CONTINUOUS FOURIER SERIES


In the course of studying heat-flow problems, Fourier showed that an arbitrary periodic
function can be represented by an infinite series of sinusoids of harmonically related
frequencies. For a function with period T, a continuous Fourier series can be written


$$ f(t) = a_0 + a_1\cos(\omega_0 t) + b_1\sin(\omega_0 t) + a_2\cos(2\omega_0 t) + b_2\sin(2\omega_0 t) + \cdots $$


or more concisely,

$$ f(t) = a_0 + \sum_{k=1}^{\infty} [a_k\cos(k\omega_0 t) + b_k\sin(k\omega_0 t)] \tag{16.17} $$


where the angular frequency of the first mode ($\omega_0 = 2\pi/T$) is called the fundamental
frequency and its constant multiples $2\omega_0$, $3\omega_0$, etc., are called harmonics. Thus, Eq. (16.17)
expresses $f(t)$ as a linear combination of the basis functions: 1, $\cos(\omega_0 t)$, $\sin(\omega_0 t)$, $\cos(2\omega_0 t)$,
$\sin(2\omega_0 t)$, ....

The coefficients of Eq. (16.17) can be computed via


$$ a_k = \frac{2}{T}\int_0^T f(t)\cos(k\omega_0 t)\,dt \tag{16.18} $$


and


$$ b_k = \frac{2}{T}\int_0^T f(t)\sin(k\omega_0 t)\,dt \tag{16.19} $$


for $k = 1, 2, \ldots$ and

$$ a_0 = \frac{1}{T}\int_0^T f(t)\,dt \tag{16.20} $$


EXAMPLE 16.2 Continuous Fourier Series Approximation


Problem Statement. Use the continuous Fourier series to approximate the square or
rectangular wave function (Fig. 16.1a) with a height of 2 and a period $T = 2\pi/\omega_0$:


$$ f(t) = \begin{cases} -1 & -T/2 < t < -T/4 \\ \phantom{-}1 & -T/4 < t < T/4 \\ -1 & T/4 < t < T/2 \end{cases} $$
Solution. Because the average height of the wave is zero, a value of $a_0 = 0$ can be obtained
directly. The remaining coefficients can be evaluated as [Eq. (16.18)]


$$
\begin{aligned}
a_k &= \frac{2}{T}\int_{-T/2}^{T/2} f(t)\cos(k\omega_0 t)\,dt \\
&= \frac{2}{T}\left[ -\int_{-T/2}^{-T/4}\cos(k\omega_0 t)\,dt + \int_{-T/4}^{T/4}\cos(k\omega_0 t)\,dt - \int_{T/4}^{T/2}\cos(k\omega_0 t)\,dt \right]
\end{aligned}
$$

The integrals can be evaluated to give


$$ a_k = \begin{cases} 4/(k\pi) & \text{for } k = 1, 5, 9, \ldots \\ -4/(k\pi) & \text{for } k = 3, 7, 11, \ldots \\ 0 & \text{for } k = \text{even integers} \end{cases} $$


FIGURE 16.4 

The Fourier series approximation of a square wave. The series of plots shows the summation 
up to and including the (a) first, (b) second, and (c) third terms. The individual terms that were 
added or subtracted at each stage are also shown. 
Similarly, it can be determined that all the $b$'s $= 0$. Therefore, the Fourier series approximation is


$$ f(t) = \frac{4}{\pi}\cos(\omega_0 t) - \frac{4}{3\pi}\cos(3\omega_0 t) + \frac{4}{5\pi}\cos(5\omega_0 t) - \frac{4}{7\pi}\cos(7\omega_0 t) + \cdots $$


The results up to the first three terms are shown in Fig. 16.4. 
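
The partial sums of this series are easy to generate numerically. The following sketch, which is not part of the original example, accumulates the first few nonzero terms using the coefficients derived above; the period T = 1, the time grid, and the variable names (nterms, fsum) are assumptions made for illustration.

```matlab
% Partial sums of the Fourier series for the square wave of Example 16.2.
% T, the time grid, and the number of terms are illustrative choices.
T = 1; w0 = 2*pi/T;                  % period and fundamental frequency
t = linspace(-T/2, T/2, 401);        % t(201) is exactly zero
nterms = 3;                          % number of nonzero terms to include

fsum = zeros(size(t));
for m = 1:nterms
  k = 2*m - 1;                       % only odd harmonics are nonzero
  ak = (-1)^(m+1)*4/(k*pi);          % +4/(k*pi) for k = 1,5,9,...; -4/(k*pi) for k = 3,7,11,...
  fsum = fsum + ak*cos(k*w0*t);
end

% The sum at t = 0 oscillates about the true value of 1 as terms are added
fprintf('Sum of %d terms at t = 0: %.4f\n', nterms, fsum(201));
```

Calling plot(t,fsum) after the loop should give a curve like the three-term summation described for Fig. 16.4. Increasing nterms sharpens the corners of the approximation.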


Before proceeding, the Fourier series can also be expressed in a more compact form 
using complex notation. This is based on Euler’s formula (Fig. 16.5): 


$$ e^{\pm ix} = \cos x \pm i\sin x \tag{16.21} $$

where $i = \sqrt{-1}$, and $x$ is in radians. Equation (16.21) can be used to express the Fourier
series concisely as (Chapra and Canale, 2010)


$$ f(t) = \sum_{k=-\infty}^{\infty} \tilde{c}_k e^{ik\omega_0 t} \tag{16.22} $$


where the coefficients are


$$ \tilde{c}_k = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-ik\omega_0 t}\,dt \tag{16.23} $$


Note that the tildes ~ are included to stress that the coefficients are complex numbers.
Because it is more concise, we will primarily use the complex form in the rest of the
chapter. Just remember that it is identical to the sinusoidal representation.
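
The equivalence of the two representations can be illustrated numerically. The sketch below is an illustration added here, not part of the original text: it evaluates Eq. (16.23) for the square wave of Example 16.2 with MATLAB's integral function and recovers the real coefficients through the standard relation between the two forms, namely that the complex coefficient equals (a_k − i b_k)/2 for k ≥ 1. The period T = 2π and the variable names are assumed for illustration.

```matlab
% Numerical evaluation of the complex coefficients of Eq. (16.23) for the
% square wave of Example 16.2. T and the range of k are illustrative choices.
T = 2*pi; w0 = 2*pi/T;
kvals = 1:5;
c = zeros(size(kvals));
for m = 1:length(kvals)
  k = kvals(m);
  g = @(t) exp(-1i*k*w0*t);          % complex exponential in Eq. (16.23)
  % f(t) = -1, +1, -1 on the three pieces of the period
  c(m) = (-integral(g, -T/2, -T/4) + integral(g, -T/4, T/4) ...
          - integral(g, T/4, T/2))/T;
end
a = 2*real(c);                       % a_k = 2 Re(c_k)
b = -2*imag(c);                      % b_k = -2 Im(c_k)
disp([kvals' a' b'])                 % columns: k, a_k, b_k
```

The a column should reproduce 4/(kπ) with alternating sign for odd k and zero for even k, and the b column should be zero to within integration tolerance.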


FIGURE 16.5 
Graphical depiction of Euler’s formula. The rotating vector is called a phasor. 


16.3 FREQUENCY AND TIME DOMAINS


To this point, our discussion of Fourier analysis has been limited to the time domain. We 
have done this because most of us are fairly comfortable conceptualizing a function’s 
behavior in this dimension. Although it is not as familiar, the frequency domain provides 
an alternative perspective for characterizing the behavior of oscillating functions. 

Just as amplitude can be plotted versus time, it can also be plotted versus frequency. 
Both types of expression are depicted in Fig. 16.6a, where we have drawn a three- 
dimensional graph of a sinusoidal function: 


$$ f(t) = C_1\cos\left(t + \frac{\pi}{2}\right) $$


FIGURE 16.6 
(a) A depiction of how a sinusoid can be portrayed in the time and the frequency domains. 


The time projection is reproduced in (b), whereas the amplitude-frequency projection is 
reproduced in (c). The phase-frequency projection is shown in (d). 
In this plot, the magnitude or amplitude of the curve $f(t)$ is the dependent variable, and
time $t$ and frequency $f = \omega_0/2\pi$ are the independent variables. Thus, the amplitude and the
time axes form a time plane, and the amplitude and the frequency axes form a frequency
plane. The sinusoid can, therefore, be conceived of as existing a distance $1/T$ out along the
frequency axis and running parallel to the time axis. Consequently, when we speak about
the behavior of the sinusoid in the time domain, we mean the projection of the curve onto
the time plane (Fig. 16.6b). Similarly, the behavior in the frequency domain is merely its
projection onto the frequency plane.

As in Fig. 16.6c, this projection is a measure of the sinusoid's maximum positive
amplitude $C_1$. The full peak-to-peak swing is unnecessary because of the symmetry.
Together with the location $1/T$ along the frequency axis, Fig. 16.6c now defines the
amplitude and frequency of the sinusoid. This is enough information to reproduce the
shape and size of the curve in the time domain. However, one more parameter, namely
the phase angle, is required to position the curve relative to $t = 0$. Consequently, a
phase diagram, as shown in Fig. 16.6d, must also be included. The phase angle is
determined as the distance (in radians) from zero to the point at which the positive peak
occurs. If the peak occurs after zero, it is said to be delayed (recall our discussion of
lags and leads in Sec. 16.1), and by convention, the phase angle is given a negative
sign. Conversely, a peak before zero is said to be advanced and the phase angle is
positive. Thus, for Fig. 16.6, the peak leads zero and the phase angle is plotted as $+\pi/2$.
Figure 16.7 depicts some other possibilities.

We can now see that Fig. 16.6c and d provide an alternative way to present or summa- 
rize the pertinent features of the sinusoid in Fig. 16.6a. They are referred to as line spectra. 
Admittedly, for a single sinusoid they are not very interesting. However, when applied to a 
more complicated situation—say, a Fourier series—their true power and value is revealed. 
For example, Fig. 16.8 shows the amplitude and phase line spectra for the square-wave 
function from Example 16.2. 

Such spectra provide information that would not be apparent from the time domain.
This can be seen by contrasting Fig. 16.4 and Fig. 16.8. Figure 16.4 presents two
alternative time domain perspectives. The first, the original square wave, tells us nothing
about the sinusoids that comprise it. The alternative is to display these sinusoids,
that is, $(4/\pi)\cos(\omega_0 t)$, $-(4/3\pi)\cos(3\omega_0 t)$, $(4/5\pi)\cos(5\omega_0 t)$, etc. This alternative does
not provide an adequate visualization of the structure of these harmonics. In contrast,
Fig. 16.8a and b provide a graphic display of this structure. As such, the line spectra
represent "fingerprints" that can help us to characterize and understand a complicated
waveform. They are particularly valuable for nonidealized cases where they sometimes
allow us to discern structure in otherwise obscure signals. In the next section, we will
describe the Fourier transform that will allow us to extend such analyses to nonperiodic
waveforms.


16.4 FOURIER INTEGRAL AND TRANSFORM


Although the Fourier series is a useful tool for investigating periodic functions, there are
many waveforms that do not repeat themselves regularly. For example, a lightning bolt
occurs only once (or at least it will be a long time until it occurs again), but it will cause
interference with receivers operating on a broad range of frequencies (for example, TVs,
radios, and shortwave receivers). Such evidence suggests that a nonrecurring signal such
as that produced by lightning exhibits a continuous frequency spectrum. Because such
phenomena are of great interest to engineers, an alternative to the Fourier series would be
valuable for analyzing these aperiodic waveforms.


FIGURE 16.7
Various phases of a sinusoid showing the associated phase line spectra.

The Fourier integral is the primary tool available for this purpose. It can be derived 
from the exponential form of the Fourier series [Eqs. (16.22) and (16.23)]. The transi- 
tion from a periodic to a nonperiodic function can be effected by allowing the period to 
approach infinity. In other words, as T becomes infinite, the function never repeats itself 
and thus becomes aperiodic. If this is allowed to occur, it can be demonstrated (e.g., Van 
Valkenburg, 1974; Hayt and Kemmerly, 1986) that the Fourier series reduces to 


$$ f(t) = \int_{-\infty}^{\infty} F(\omega)e^{i\omega t}\,d\omega \tag{16.24} $$


and the coefficients become a continuous function of the frequency variable $\omega$, as in


$$ F(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(t)e^{-i\omega t}\,dt \tag{16.25} $$


The function $F(\omega)$, as defined by Eq. (16.25), is called the Fourier integral of $f(t)$. In
addition, Eqs. (16.24) and (16.25) are collectively referred to as the Fourier transform pair.
Thus, along with being called the Fourier integral, $F(\omega)$ is also called the Fourier
transform of $f(t)$. In the same spirit, $f(t)$, as defined by Eq. (16.24), is referred to as the inverse
Fourier transform of $F(\omega)$. Thus, the pair allows us to transform back and forth between
the time and the frequency domains for an aperiodic signal.


FIGURE 16.8
(a) Amplitude and (b) phase line spectra for the square wave from Fig. 16.4.

The distinction between the Fourier series and transform should now be quite clear. 
The major difference is that each applies to a different class of functions—the series to 
periodic and the transform to nonperiodic waveforms. Beyond this major distinction, the 
two approaches differ in how they move between the time and the frequency domains. The 
Fourier series converts a continuous, periodic time-domain function to frequency-domain 
magnitudes at discrete frequencies. In contrast, the Fourier transform converts a continu- 
ous time-domain function to a continuous frequency-domain function. Thus, the discrete 
frequency spectrum generated by the Fourier series is analogous to a continuous frequency 
spectrum generated by the Fourier transform. 
Now that we have introduced a way to analyze an aperiodic signal, we will take the
final step in our development. In the next section, we will acknowledge the fact that a
signal is rarely characterized as a continuous function of the sort needed to implement
Eq. (16.25). Rather, the data are invariably in a discrete form. Thus, we will now show how
to compute a Fourier transform for such discrete measurements.


16.5 DISCRETE FOURIER TRANSFORM (DFT)


In engineering, functions are often represented by a finite set of discrete values. Addi- 
tionally, data are often collected in or converted to such a discrete format. As depicted in 
Fig. 16.9, an interval from 0 to T can be divided into n equispaced subintervals with widths 
of $\Delta t = T/n$. The subscript $j$ is employed to designate the discrete times at which samples
are taken. Thus, $f_j$ designates a value of the continuous function $f(t)$ taken at $t_j$. Note that
the data points are specified at $j = 0, 1, 2, \ldots, n - 1$. A value is not included at $j = n$. (See
Ramirez, 1985, for the rationale for excluding $f_n$.)

For the system in Fig. 16.9, a discrete Fourier transform can be written as


$$ F_k = \sum_{j=0}^{n-1} f_j e^{-ik\omega_0 j} \qquad \text{for } k = 0 \text{ to } n - 1 \tag{16.26} $$

and the inverse Fourier transform as

$$ f_j = \frac{1}{n}\sum_{k=0}^{n-1} F_k e^{ik\omega_0 j} \qquad \text{for } j = 0 \text{ to } n - 1 \tag{16.27} $$


where $\omega_0 = 2\pi/n$.


FIGURE 16.9 
The sampling points of the discrete Fourier series. 
Equations (16.26) and (16.27) represent the discrete analogs of Eqs. (16.25) and
(16.24), respectively. As such, they can be employed to compute both a direct and an
inverse Fourier transform for discrete data. Note that the factor $1/n$ in Eq. (16.27) is merely a
scale factor that can be included in either Eq. (16.26) or (16.27), but not both. For example,
if it is shifted to Eq. (16.26), the first coefficient $F_0$ (which is the analog of the constant $a_0$)
is equal to the arithmetic mean of the samples.
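
Equations (16.26) and (16.27) translate almost directly into code. The following is a minimal sketch, assuming an arbitrary set of eight sample values chosen only for illustration; it evaluates the sums with explicit loops and compares the result with MATLAB's built-in fft. It is meant to make the definitions concrete, not to be an efficient implementation.

```matlab
% Direct evaluation of the DFT, Eq. (16.26), compared with MATLAB's fft.
% The sample values are arbitrary illustrative data.
f = [1 3 2 5 4 0 2 1];
n = length(f);
w0 = 2*pi/n;
idx = 0:n-1;                         % indices j (or k) running from 0 to n-1

F = zeros(1, n);
for k = 0:n-1
  F(k+1) = sum(f.*exp(-1i*k*w0*idx));      % Eq. (16.26)
end

% Inverse transform, Eq. (16.27), recovers the samples
fback = zeros(1, n);
for j = 0:n-1
  fback(j+1) = sum(F.*exp(1i*idx*w0*j))/n; % Eq. (16.27)
end

maxdiff = max(abs(F - fft(f)));
fprintf('Largest difference from fft: %g\n', maxdiff);
fprintf('F(1)/n = %.4f, mean of samples = %.4f\n', real(F(1))/n, mean(f));
```

The last line illustrates the remark about scaling: dividing the first coefficient by n gives the arithmetic mean of the samples.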

Before proceeding, several other aspects of the DFT bear mentioning. The highest 
frequency that can be measured in a signal, called the Nyquist frequency, is half the sam- 
pling frequency. Periodic variations that occur more rapidly than the shortest sampled time 
interval cannot be detected. The lowest frequency you can detect is the inverse of the total 
sample length. 

As an example, suppose that you take 100 samples of data ($n = 100$ samples) at a
sample frequency of $f_s = 1000$ Hz (i.e., 1000 samples per second). This means that the
sample interval is


$$ \Delta t = \frac{1}{f_s} = \frac{1}{1000\ \text{samples/s}} = 0.001\ \text{s/sample} $$


The total sample length is


$$ t_n = \frac{n}{f_s} = \frac{100\ \text{samples}}{1000\ \text{samples/s}} = 0.1\ \text{s} $$


and the frequency increment is


$$ \Delta f = \frac{f_s}{n} = \frac{1000\ \text{samples/s}}{100\ \text{samples}} = 10\ \text{Hz} $$


The Nyquist frequency is

$$ f_{\max} = 0.5 f_s = 0.5(1000\ \text{Hz}) = 500\ \text{Hz} $$


and the lowest detectable frequency is


$$ f_{\min} = \frac{1}{0.1\ \text{s}} = 10\ \text{Hz} $$

Thus, for this example, the DFT could detect signals with periods from 1/500 = 0.002 s
up to 1/10 = 0.1 s.
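
These quantities follow from n and the sample frequency alone, so they are convenient to compute before any transform is attempted. The short sketch below simply repeats the arithmetic of the example; the variable names are illustrative.

```matlab
% Sampling quantities for n = 100 samples taken at fs = 1000 Hz.
n = 100; fs = 1000;
dt = 1/fs;          % sample interval, s/sample
tn = n/fs;          % total sample length, s
df = fs/n;          % frequency increment, Hz
fmax = 0.5*fs;      % Nyquist frequency, Hz
fmin = 1/tn;        % lowest detectable frequency, Hz
fprintf('dt = %g s, tn = %g s, df = %g Hz\n', dt, tn, df);
fprintf('fmax = %g Hz, fmin = %g Hz\n', fmax, fmin);
```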


16.5.1 Fast Fourier Transform (FFT) 


Although an algorithm can be developed to compute the DFT based on Eq. (16.26), it is
computationally burdensome because $n^2$ operations are required. Consequently, for data
samples of even moderate size, the direct determination of the DFT can be extremely time
consuming.

FIGURE 16.10
Plot of number of operations versus sample size for the standard DFT and the FFT. 


The fast Fourier transform, or FFT, is an algorithm that has been developed to
compute the DFT in an extremely economical fashion. Its speed stems from the fact that
it utilizes the results of previous computations to reduce the number of operations. In
particular, it exploits the periodicity and symmetry of trigonometric functions to
compute the transform with approximately $n\log_2 n$ operations (Fig. 16.10). Thus, for n = 50
samples, the FFT is about 10 times faster than the standard DFT. For n = 1000, it is about
100 times faster.
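
The quoted speedups follow from the ratio of the two operation counts. As a quick check, assuming the counts are exactly n² and n log₂ n (the text gives them only approximately), the ratio can be evaluated as:

```matlab
% Ratio of operation counts for the standard DFT and the FFT.
n = [50 1000];
opsDFT = n.^2;                 % direct DFT
opsFFT = n.*log2(n);           % approximate FFT count
speedup = opsDFT./opsFFT;
fprintf('n = %4d: speedup of about %.1f\n', [n; speedup]);
```

This gives ratios of roughly 8.9 and 100.3, consistent with the rounded figures of 10 and 100.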

The first FFT algorithm was developed by Gauss in the early nineteenth century 
(Heideman et al., 1984). Other major contributions were made by Runge, Danielson, 
Lanczos, and others in the early twentieth century. However, because discrete transforms 
often took days to weeks to calculate by hand, they did not attract broad interest prior to the 
development of the modern digital computer. 

In 1965, J. W. Cooley and J. W. Tukey published a key paper in which they outlined 
an algorithm for calculating the FFT. This scheme, which is similar to those of Gauss and 
other earlier investigators, is called the Cooley-Tukey algorithm. Today, there are a host of 
other approaches that are offshoots of this method. As described next, MATLAB offers a 
function called fft that employs such efficient algorithms to compute the DFT. 


16.5.2 MATLAB Function: fft 


MATLAB’s fft function provides an efficient way to compute the DFT. A simple repre- 
sentation of its syntax is 


F = fft(f, n) 
where F = a vector containing the DFT, and f = a vector containing the signal. The
parameter n, which is optional, indicates that the user wants to implement an n-point FFT.
If f has fewer than n points, it is padded with zeros; if it has more, it is truncated.

Note that the elements in F are sequenced in what is called reverse-wrap-around
order. The first half of the values are the positive frequencies (starting with the constant)
and the second half are the negative frequencies. Thus, if n = 8, the order is 0, 1, 2, 3, 4,
-3, -2, -1. The following example illustrates the function's use to calculate the DFT of
a simple sinusoid.
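
The ordering can be made concrete by listing the index k, and the frequency it represents, for each element of the output vector. The sketch below assumes n is even and uses an illustrative sampling frequency of 50 Hz; neither the variable names nor the value of fs come from the fft documentation.

```matlab
% Reverse-wrap-around ordering of the fft output for even n.
n = 8; fs = 50;                    % fs is an assumed sampling frequency, Hz
k = [0:n/2, -(n/2-1):-1];          % 0 1 2 3 4 -3 -2 -1
fk = k*fs/n;                       % signed frequency of each element, Hz
disp([(1:n)' k' fk'])              % columns: position in F, index k, frequency
```

MATLAB also provides fftshift, which rearranges a transform so that the zero-frequency element sits at the center of the vector.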


EXAMPLE 16.3 Computing the DFT of a Simple Sinusoid with MATLAB

Problem Statement. Apply the MATLAB fft function to determine the discrete Fourier
transform for a simple sinusoid:

$$ f(t) = 5 + \cos(2\pi(12.5)t) + \sin(2\pi(18.75)t) $$


Generate 8 equispaced points with $\Delta t = 0.02$ s. Plot the result versus frequency.


Solution. Before generating the DFT, we can compute a number of quantities. The
sampling frequency is


$$ f_s = \frac{1}{\Delta t} = \frac{1}{0.02\ \text{s}} = 50\ \text{Hz} $$


The total sample length is


$$ t_n = \frac{n}{f_s} = \frac{8\ \text{samples}}{50\ \text{samples/s}} = 0.16\ \text{s} $$


The Nyquist frequency is

$$ f_{\max} = 0.5 f_s = 0.5(50\ \text{Hz}) = 25\ \text{Hz} $$

and the lowest detectable frequency is


$$ f_{\min} = \frac{1}{0.16\ \text{s}} = 6.25\ \text{Hz} $$

Thus, the analysis can detect signals with periods from 1/25 = 0.04 s up to 1/6.25 = 0.16 s.
So we should be able to detect both the 12.5 and 18.75 Hz signals.

The following MATLAB statements can be used to generate and plot the sample 
(Fig. 16.11a): 


>> clc
>> n=8; dt=0.02; fs=1/dt; T = 0.16;
>> tspan=(0:n-1)/fs;
>> y=5+cos(2*pi*12.5*tspan)+sin(2*pi*18.75*tspan);
>> subplot(3,1,1);
>> plot(tspan,y,'-ok','linewidth',2,'MarkerFaceColor','black');
>> title('(a) f(t) versus time (s)');


FIGURE 16.11

Results of computing a DFT with MATLAB's fft function: (a) the sample; and
plots of the (b) real and (c) imaginary parts of the DFT versus frequency.


As was mentioned at the beginning of Sec. 16.5, notice that tspan omits the last point. 
The fft function can be used to compute the DFT and display the results 


>> Y=fft(y)/n; 


>> Y' 


We have divided the transform by n in order that the first coefficient is equal to the arithme- 
tic mean of the samples. When this code is executed, the results are displayed as 


ans =
   5.0000
   0.0000 - 0.0000i
   0.5000
  -0.0000 + 0.5000i
        0
  -0.0000 - 0.5000i
   0.5000
   0.0000 + 0.0000i
Notice that the first coefficient corresponds to the signal's mean value. In addition,
because of the reverse-wrap-around order, the results can be interpreted as in the following
table:


Index    k     Frequency    Period      Real    Imaginary

1        0     constant                 5       0
2        1     6.25         0.16        0       0
3        2     12.5         0.08        0.5     0
4        3     18.75        0.053333    0       0.5
5        4     25           0.04        0       0
6        -3    31.25        0.032       0       -0.5
7        -2    37.5         0.026667    0.5     0
8        -1    43.75        0.022857    0       0


Because the prime operator in Y' returns the complex conjugate transpose, the imaginary
parts listed here have the opposite sign of those stored in Y itself; for example,
Y(4) = -0.5i.

Notice that the fft has detected the 12.5- and 18.75-Hz signals. In addition, the row for
the Nyquist frequency (25 Hz) marks the point beyond which the values in the table are
redundant. That is, the rows after it are merely reflections of the results below the
Nyquist frequency.

If we remove the constant value, we can plot both the real and imaginary parts of the 
DFT versus frequency 


>> nyquist=fs/2;fmin=1/T;
>> f = linspace(fmin,nyquist,n/2);
>> Y(1)=[];YP=Y(1:n/2);
>> subplot(3,1,2)
>> stem(f,real(YP),'linewidth',2,'MarkerFaceColor','blue')
>> grid;title('(b) Real component versus frequency')
>> subplot(3,1,3)
>> stem(f,imag(YP),'linewidth',2,'MarkerFaceColor','blue')
>> grid;title('(c) Imaginary component versus frequency')
>> xlabel('frequency (Hz)')


As expected (recall Fig. 16.7), a positive peak occurs for the cosine at 12.5 Hz
(Fig. 16.11b), and a negative peak occurs for the sine at 18.75 Hz (Fig. 16.11c).


16.6 THE POWER SPECTRUM


Beyond amplitude and phase spectra, power spectra provide another useful way to discern
the underlying harmonics of seemingly random signals. As the name implies, the power
spectrum derives from the analysis of the power output of electrical systems. In terms of
the DFT, a power spectrum consists of a plot of the power associated with each frequency
component versus frequency. The power can be computed by summing the squares of the
Fourier coefficients:


$$ P_k = |\tilde{c}_k|^2 $$


where $P_k$ is the power associated with each frequency $k\omega_0$.
EXAMPLE 16.4 Computing the Power Spectrum with MATLAB


Problem Statement. Compute the power spectrum for the simple sinusoid for which the 
DFT was computed in Example 16.3. 


Solution. The following script can be developed to compute the power spectrum: 


% compute the DFT
clc;clf
n=8; dt=0.02;
fs=1/dt;tspan=(0:n-1)/fs;
y=5+cos(2*pi*12.5*tspan)+sin(2*pi*18.75*tspan);
Y=fft(y)/n;
f = (0:n-1)*fs/n;
Y(1)=[];f(1)=[];
% compute and display the power spectrum
nyquist=fs/2;
f = (1:n/2)/(n/2)*nyquist;
Pyy = abs(Y(1:n/2)).^2;
stem(f,Pyy,'linewidth',2,'MarkerFaceColor','blue')
title('Power spectrum')
xlabel('Frequency (Hz)');ylim([0 0.3])


As indicated, the first section merely computes the DFT with the pertinent statements from 
Example 16.3. The second section then computes and displays the power spectrum. As 
in Fig. 16.12, the resulting graph indicates that peaks occur at both 12.5 and 18.75 Hz as 
expected. 


FIGURE 16.12 
Power spectrum for a simple sinusoidal function with frequencies of 12.5 and 18.75 Hz. 
16.7 CASE STUDY SUNSPOTS


Background. In 1848, Johann Rudolph Wolf devised a method for quantifying solar ac- 
tivity by counting the number of individual spots and groups of spots on the sun’s surface. 
He computed a quantity, now called a Wolf sunspot number, by adding 10 times the num- 
ber of groups plus the total count of individual spots. As in Fig. 16.13, the data set for the 
sunspot number extends back to 1700. On the basis of the early historical records, Wolf 
determined the cycle’s length to be 11.1 years. Use a Fourier analysis to confirm this result 
by applying an FFT to the data. 


Solution. The data for year and sunspot number are contained in a MATLAB file,
sunspot.dat. The following statements load the file and assign the year and number
information to vectors of the same name:


>> load sunspot.dat 
>> year =sunspot(:,1);number=sunspot(:,2); 


Before applying the Fourier analysis, it is noted that the data seem to exhibit an upward 
linear trend (Fig. 16.13). MATLAB can be used to remove this trend: 

>> n=length(number);
>> a=polyfit(year,number,1);
>> lineartrend=polyval(a,year);
>> ft=number-lineartrend;


Next, the fft function is employed to generate the DFT

F=fft(ft);

The power spectrum can then be computed and plotted

fs=1;
f=(0:n/2)*fs/n;
pow=abs(F(1:n/2+1)).^2;
plot(f,pow)
xlabel('Frequency (cycles/year)'); ylabel('Power')
title('Power versus frequency')


FIGURE 16.13
Plot of Wolf sunspot number versus year. The dashed line indicates a mild, upward linear
trend.


FIGURE 16.14
Power spectrum for Wolf sunspot number versus year.

The result, as shown in Fig. 16.14, indicates a peak at a frequency of about 0.0915 cycles/yr. 
This corresponds to a period of 1/0.0915 = 10.93 years. Thus, the Fourier analysis is consis- 
tent with Wolf’s estimate of 11 years. 
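
Rather than reading the peak from the plot, it can be located programmatically. The following sketch is an illustration only: because sunspot.dat may not be at hand, it substitutes a synthetic yearly series with a mild upward trend and an 11-year cycle, which is an assumption made here and not the actual sunspot record. The detrending and power-spectrum steps mirror the statements used in the case study.

```matlab
% Locate the dominant period in a power spectrum. The series below is a
% synthetic stand-in for the sunspot data (assumed 11-year cycle plus trend).
year = (1700:1999)';
number = 50 + 0.02*(year-1700) + 40*cos(2*pi*(year-1700)/11);

n = length(number);
a = polyfit(year, number, 1);          % remove the linear trend
ft = number - polyval(a, year);

F = fft(ft);
fs = 1;                                % one sample per year
f = (0:n/2)*fs/n;
pow = abs(F(1:n/2+1)).^2;

[pmax, imax] = max(pow(2:end));        % skip the zero-frequency term
fpeak = f(imax+1);
period = 1/fpeak;
fprintf('Peak at %.4f cycles/yr, period = %.2f yr\n', fpeak, period);
```

The recovered period is limited by the frequency increment fs/n, so it lands on the nearest available frequency (here about 11.1 years) rather than exactly on the period used to build the series. The same resolution limit applies to the estimate obtained from the real data.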


PROBLEMS 


16.1 The following equation describes the variations of 
temperature of a tropical lake: 


$$ T(t) = 12.8 + 4\cos\left(\frac{2\pi}{365}t\right) + 3\sin\left(\frac{2\pi}{365}t\right) $$


What is (a) the mean temperature, (b) the amplitude, and 
(c) the period? 

16.2 The temperature in a pond varies sinusoidally over the 
course of a year. Use linear least-squares regression to fit 
Eq. (16.11) to the following data. Use your fit to determine 
the mean, amplitude, and date of maximum temperature. 
Note that the period is 365 d. 


t, d     15    45    75    105    135    165    225    255    285    315    345
T, °C    3.4   4.7   8.5   11.7   16     18.7   19.7   17.1   12.7   7.7    5.1


16.3 The pH in a reactor varies sinusoidally over the course 
of a day. Use least-squares regression to fit Eq. (16.11) to the 
following data. Use your fit to determine the mean, ampli- 
tude, and time of maximum pH. Note that the period is 24 hr 


Time, hr 0 2 4 5 7 9 
pH 76 7.2 7 6.5 Pb 7.2 
Time, hr 12 15 20 22 24 
pH 8.9 9.1 8.9 7.9 7 
16.4 The solar radiation for Tucson, Arizona, has been
tabulated as


Time, mo           J     F     M     A     M     J
Radiation, W/m²    144   188   245   311   351   359
Time, mo           J     A     S     O     N     D
Radiation, W/m²    308   287   260   211   159   131


Assuming each month is 30 days long, fit a sinusoid to these 
data. Use the resulting equation to predict the radiation in 
mid-August. 

16.5 The average values of a function can be determined by 


ja 


Use this relationship to verify the results of Eq. (16.13). 
16.6 In electric circuits, it is common to see current behavior
in the form of a square wave as shown in Fig. P16.6 (notice
that this square wave differs from the one described in Example
16.2). Solving for the Fourier series from


$$ f(t) = \begin{cases} A_0 & 0 < t < T/2 \\ -A_0 & T/2 < t < T \end{cases} $$


the Fourier series can be represented as


$$ f(t) = \sum_{n=1}^{\infty} \left(\frac{4A_0}{(2n-1)\pi}\right) \sin\left(\frac{2\pi(2n-1)t}{T}\right) $$


Develop a MATLAB function to generate a plot of the
first n terms of the Fourier series individually, as well as
the sum of these six terms. Design your function so that it
plots the curves from t = 0 to 4T. Use thin dotted red lines
for the individual terms and a bold black solid line for the
summation (i.e., 'k-','linewidth',2). The function's first
line should be


function [t,f] = FourierSquare(A0,T,n)


Let A0 = 1 and T = 0.25 s.


FIGURE P16.6


FIGURE P16.7
A sawtooth wave.


FIGURE P16.8
A triangular wave.

16.7 Use a continuous Fourier series to approximate the 
sawtooth wave in Fig. P16.7. Plot the first four terms along 
with the summation. In addition, construct amplitude and 
phase line spectra for the first four terms. 

16.8 Use a continuous Fourier series to approximate the tri- 
angular wave form in Fig. P16.8. Plot the first four terms 
along with the summation. In addition, construct amplitude 
and phase line spectra for the first four terms. 

16.9 Use the Maclaurin series expansions for $e^x$, cos x, and
sin x to prove Euler’s formula [Eq. (16.21)].

16.10 A half-wave rectifier can be characterized by 


$$C(t) = C_1\left[\frac{1}{\pi} + \frac{1}{2}\sin t - \frac{2}{3\pi}\cos 2t - \frac{2}{15\pi}\cos 4t - \frac{2}{35\pi}\cos 6t - \cdots\right]$$

where $C_1$ is the amplitude of the wave.

(a) Plot the first four terms along with the summation. 

(b) Construct amplitude and phase line spectra for the first 
four terms. 

16.11 Duplicate Example 16.3, but for 64 points sampled at
a rate of $\Delta t = 0.01$ s from the function

$$f(t) = \cos[2\pi(12.5)t] + \cos[2\pi(25)t]$$


Use fft to generate a DFT of these values and plot the 
results. 
16.12 Use MATLAB to generate 64 points from the function 


$$f(t) = \cos(10t) + \sin(3t)$$

from t = 0 to $2\pi$. Add a random component to the signal
with the function randn. Use fft to generate a DFT of these 
values and plot the results. 

16.13 Use MATLAB to generate 32 points for the sinusoid 
depicted in Fig. 16.2 from t = 0 to 6 s. Compute the DFT
and create subplots of (a) the original signal, (b) the real 
part, and (c) the imaginary part of the DFT versus frequency. 


16.14 Use the fft function to compute a DFT for the trian- 
gular wave from Prob. 16.8. Sample the wave from t = 0 to 
4T using 128 sample points. 

16.15 Develop an M-file function that uses the fft func- 
tion to generate a power spectrum plot. Use it to solve 
Prob. 16.11. 

16.16 Use the fft function to compute the DFT for the fol- 
lowing function: 


$$f(t) = 1.5 + 1.8\cos(2\pi(12)t) + 0.8\sin(2\pi(20)t) - 1.25\cos(2\pi(28)t)$$

Take n = 64 samples with a sampling frequency of $f_s = 128$
samples/s. Have your script compute values of $\Delta t$, $t_n$, $\Delta f$,
$f_{min}$, and $f_{max}$. As illustrated in Examples 16.3 and 16.4, have
your script generate plots as in Fig. 16.11 and Fig. 16.12.

16.17 If you take 128 samples of data (n = 128 samples)
with a total sample length of $t_n = 0.4$ s, compute the following:
(a) the sample frequency, $f_s$ (sample/s); (b) the sample
interval, $\Delta t$ (s/sample); (c) the Nyquist frequency, $f_{max}$ (Hz);
(d) the minimum frequency, $f_{min}$ (Hz).


Polynomial Interpolation 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to polynomial interpolation. 
Specific objectives and topics covered are 


• Recognizing that evaluating polynomial coefficients with simultaneous equations
  is an ill-conditioned problem.
• Knowing how to evaluate polynomial coefficients and interpolate with
  MATLAB’s polyfit and polyval functions.
• Knowing how to perform an interpolation with Newton’s polynomial.
• Knowing how to perform an interpolation with a Lagrange polynomial.
• Knowing how to solve an inverse interpolation problem by recasting it as a roots
  problem.
• Appreciating the dangers of extrapolation.
• Recognizing that higher-order polynomials can manifest large oscillations.


YOU’VE GOT A PROBLEM 


If we want to improve the velocity prediction for the free-falling bungee jumper, we might

expand our model to account for other factors beyond mass and the drag coefficient. As 

was previously mentioned in Sec. 1.4, the drag coefficient can itself be formulated as a 
function of other factors such as the area of the jumper and characteristics such as the air’s 
density and viscosity. 

Air density and viscosity are commonly presented in tabular form as a function of 
temperature. For example, Table 17.1 is reprinted from a popular fluid mechanics textbook 
(White, 1999). 

TABLE 17.1 Density (ρ), dynamic viscosity (μ), and kinematic viscosity (ν) as a function of
temperature (T) at 1 atm as reported by White (1999).


T, °C    ρ, kg/m³    μ, N·s/m²       ν, m²/s
-40      1.52        1.51 × 10⁻⁵     0.99 × 10⁻⁵
0        1.29        1.71 × 10⁻⁵     1.33 × 10⁻⁵
20       1.20        1.80 × 10⁻⁵     1.50 × 10⁻⁵
50       1.09        1.95 × 10⁻⁵     1.79 × 10⁻⁵
100      0.946       2.17 × 10⁻⁵     2.30 × 10⁻⁵
150      0.835       2.38 × 10⁻⁵     2.85 × 10⁻⁵
200      0.746       2.57 × 10⁻⁵     3.45 × 10⁻⁵
250      0.675       2.75 × 10⁻⁵     4.08 × 10⁻⁵
300      0.616       2.93 × 10⁻⁵     4.75 × 10⁻⁵
400      0.525       3.25 × 10⁻⁵     6.20 × 10⁻⁵
500      0.457       3.55 × 10⁻⁵     7.77 × 10⁻⁵


Suppose that you desired the density at a temperature not included in the table. In
such a case, you would have to interpolate. That is, you would have to estimate the value
at the desired temperature based on the densities that bracket it. The simplest approach is
to determine the equation for the straight line connecting the two adjacent values and use
this equation to estimate the density at the desired intermediate temperature. Although such
linear interpolation is perfectly adequate in many cases, error can be introduced when the
data exhibit significant curvature. In this chapter, we will explore a number of different
approaches for obtaining adequate estimates for such situations.
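The straight-line estimate described above can be sketched in a few lines of MATLAB. The target temperature of 350 °C is an assumption made here purely for illustration; the two densities are the Table 17.1 values that bracket it. The built-in interp1 function, which interpolates linearly by default, is included only as a cross-check.

```matlab
% Linear interpolation between the two tabulated densities that bracket
% the target temperature (350 C is an illustrative choice, not from the text)
T   = [300 400];        % bracketing temperatures, deg C (Table 17.1)
rho = [0.616 0.525];    % corresponding densities, kg/m^3
Tq  = 350;              % temperature at which the density is wanted

% straight line through the two points, written out explicitly
rho_line = rho(1) + (rho(2) - rho(1))/(T(2) - T(1))*(Tq - T(1));

% interp1 performs the same linear interpolation by default
rho_interp1 = interp1(T, rho, Tq);

fprintf('explicit line: %.4f   interp1: %.4f\n', rho_line, rho_interp1)
```

Both routes give the midpoint of the two tabulated densities, because 350 °C lies halfway between the bracketing temperatures.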


17.1 INTRODUCTION TO INTERPOLATION


You will frequently have occasion to estimate intermediate values between precise data 
points. The most common method used for this purpose is polynomial interpolation. The 
general formula for an (n — 1)th-order polynomial can be written as 

$$f(x) = a_1 + a_2 x + a_3 x^2 + \cdots + a_n x^{n-1} \tag{17.1}$$

For n data points, there is one and only one polynomial of order (n — 1) that passes through 
all the points. For example, there is only one straight line (i.e., a first-order polynomial) 
that connects two points (Fig. 17.1a). Similarly, only one parabola connects a set of three 
points (Fig. 17.1b). Polynomial interpolation consists of determining the unique (n — 1) 
th-order polynomial that fits n data points. This polynomial then provides a formula to 
compute intermediate values. 

Before proceeding, we should note that MATLAB represents polynomial coefficients 
in a different manner than Eq. (17.1). Rather than using increasing powers of x, it uses 
decreasing powers as in 


$$f(x) = p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n \tag{17.2}$$


To be consistent with MATLAB, we will adopt this scheme in the following section. 
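
To make the ordering convention concrete, the following sketch stores a quadratic in MATLAB's decreasing-power form and evaluates it two ways. The coefficient values and the evaluation point are arbitrary choices for illustration and do not come from the text.

```matlab
% Coefficients are stored in decreasing powers of x, as in Eq. (17.2)
p = [2 -3 5];     % represents 2x^2 - 3x + 5 (illustrative values)
x = 4;            % arbitrary evaluation point

y_polyval = polyval(p, x);                 % built-in evaluation
y_manual  = p(1)*x^2 + p(2)*x + p(3);      % same polynomial written out

fprintf('polyval: %g   manual: %g\n', y_polyval, y_manual)
```

If the same vector were read in the increasing-power order of Eq. (17.1), it would instead describe 2 - 3x + 5x^2, so it pays to keep the convention in mind when passing coefficients between formulas and MATLAB functions.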
FIGURE 17.1
Examples of interpolating polynomials: (a) first-order (linear) connecting two points,
(b) second-order (quadratic or parabolic) connecting three points, and (c) third-order (cubic)
connecting four points.


17.1.1 Determining Polynomial Coefficients


A straightforward way for computing the coefficients of Eq. (17.2) is based on the fact that
n data points are required to determine the n coefficients. As in the following example, this
allows us to generate n linear algebraic equations that we can solve simultaneously for the
coefficients.


EXAMPLE 17.1 Determining Polynomial Coefficients with Simultaneous Equations


Problem Statement. Suppose that we want to determine the coefficients of the parabola,
$f(x) = p_1 x^2 + p_2 x + p_3$, that passes through the last three density values from Table 17.1:

$$\begin{aligned} x_1 &= 300 & f(x_1) &= 0.616 \\ x_2 &= 400 & f(x_2) &= 0.525 \\ x_3 &= 500 & f(x_3) &= 0.457 \end{aligned}$$

Each of these pairs can be substituted into Eq. (17.2) to yield a system of three equations:

$$\begin{aligned} 0.616 &= p_1(300)^2 + p_2(300) + p_3 \\ 0.525 &= p_1(400)^2 + p_2(400) + p_3 \\ 0.457 &= p_1(500)^2 + p_2(500) + p_3 \end{aligned}$$

or in matrix form:

$$\begin{bmatrix} 90{,}000 & 300 & 1 \\ 160{,}000 & 400 & 1 \\ 250{,}000 & 500 & 1 \end{bmatrix} \begin{Bmatrix} p_1 \\ p_2 \\ p_3 \end{Bmatrix} = \begin{Bmatrix} 0.616 \\ 0.525 \\ 0.457 \end{Bmatrix}$$

Thus, the problem reduces to solving three simultaneous linear algebraic equations for
the three unknown coefficients. A simple MATLAB session can be used to obtain the
solution:

>> format long
>> A = [90000 300 1;160000 400 1;250000 500 1];
>> b = [0.616 0.525 0.457]';
>> p = A\b

p =
   0.00000115000000
  -0.00171500000000
   1.02700000000000

Thus, the parabola that passes exactly through the three points is

$$f(x) = 0.00000115x^2 - 0.001715x + 1.027$$

This polynomial then provides a means to determine intermediate points. For example, the
value of density at a temperature of 350 °C can be calculated as

$$f(350) = 0.00000115(350)^2 - 0.001715(350) + 1.027 = 0.567625$$


Although the approach in Example 17.1 provides an easy way to perform interpola- 
tion, it has a serious deficiency. To understand this flaw, notice that the coefficient matrix 
in Example 17.1 has a decided structure. This can be seen clearly by expressing it in general 
terms: 


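$$\begin{bmatrix} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ x_3^2 & x_3 & 1 \end{bmatrix} \begin{Bmatrix} p_1 \\ p_2 \\ p_3 \end{Bmatrix} = \begin{Bmatrix} f(x_1) \\ f(x_2) \\ f(x_3) \end{Bmatrix} \tag{17.3}$$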


Coefficient matrices of this form are referred to as Vandermonde matrices. Such ma- 
trices are very ill-conditioned. That is, their solutions are very sensitive to round off errors. 
This can be illustrated by using MATLAB to compute the condition number for the coefficient
matrix from Example 17.1 as

>> cond(A)
ans =
  5.8932e+006


This condition number, which is quite large for a 3 x 3 matrix, implies that about six digits 
of the solution would be questionable. The ill-conditioning becomes even worse as the 
number of simultaneous equations becomes larger. 
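
The growth of the condition number can be explored with a short sketch. It uses MATLAB's built-in vander function, which builds the Vandermonde matrix with columns in decreasing powers, matching Eq. (17.3). The five-point temperature set is an assumption chosen here from Table 17.1 for illustration; no particular value for its condition number is claimed.

```matlab
% Condition number of the 3-point Vandermonde matrix from Example 17.1
T3 = [300 400 500];
A3 = vander(T3);          % columns are T3.^2, T3, and ones
c3 = cond(A3);

% Same calculation for five temperatures (an illustrative choice)
T5 = [200 250 300 400 500];
A5 = vander(T5);          % columns are T5.^4, T5.^3, ..., ones
c5 = cond(A5);

fprintf('cond, 3 points: %.4e\n', c3)
fprintf('cond, 5 points: %.4e\n', c5)
```

Running the sketch with other point sets is a quick way to see how rapidly the conditioning deteriorates as equations are added.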

As a consequence, there are alternative approaches that do not manifest this short- 
coming. In this chapter, we will also describe two alternatives that are well-suited for 
computer implementation: the Newton and the Lagrange polynomials. Before doing this, 
however, we will first briefly review how the coefficients of the interpolating polynomial 
can be estimated directly with MATLAB’s built-in functions. 




17.1.2 MATLAB Functions: polyfit and polyval 


Recall from Sec. 14.5.2, that the polyfit function can be used to perform polynomial 
regression. In such applications, the number of data points is greater than the number of 
coefficients being estimated. Consequently, the least-squares fit line does not necessarily 
pass through any of the points, but rather follows the general trend of the data. 

For the case where the number of data points equals the number of coefficients, polyfit
performs interpolation. That is, it returns the coefficients of the polynomial that passes
directly through the data points. For example, it can be used to determine the coefficients
of the parabola that passes through the last three density values from Table 17.1:

>> format long
>> T = [300 400 500];
>> density = [0.616 0.525 0.457];
>> p = polyfit(T,density,2)

p =
   0.00000115000000  -0.00171500000000   1.02700000000000

We can then use the polyval function to perform an interpolation as in

>> d = polyval(p,350)

d =
   0.56762500000000


These results agree with those obtained previously in Example 17.1 with simultaneous 
equations. 
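
The distinction between regression and interpolation drawn above can be checked numerically. In the sketch below, the same three density values are fit with a second-order polynomial (as many coefficients as points) and with a first-order polynomial (fewer coefficients than points). The variable names res2 and res1 are introduced here for illustration only.

```matlab
% Same three points, two polynomial orders
T       = [300 400 500];
density = [0.616 0.525 0.457];

p2 = polyfit(T, density, 2);   % 3 coefficients, 3 points: interpolation
p1 = polyfit(T, density, 1);   % 2 coefficients, 3 points: regression

% largest miss at the data points for each fit
res2 = max(abs(polyval(p2, T) - density));
res1 = max(abs(polyval(p1, T) - density));

fprintf('max residual, order 2: %.2e\n', res2)
fprintf('max residual, order 1: %.2e\n', res1)
```

The second-order fit should reproduce the data to within roundoff, whereas the straight line misses the points by a visible margin because it only follows their trend.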


17.2 NEWTON INTERPOLATING POLYNOMIAL


There are a variety of alternative forms for expressing an interpolating polynomial beyond 
the familiar format of Eq. (17.2). Newton’s interpolating polynomial is among the most 
popular and useful forms. Before presenting the general equation, we will introduce the 
first- and second-order versions because of their simple visual interpretation. 


17.2.1 Linear Interpolation 


The simplest form of interpolation is to connect two data points with a straight line. This 
technique, called linear interpolation, is depicted graphically in Fig. 17.2. Using similar 
triangles, 


$$\frac{f_1(x) - f(x_1)}{x - x_1} = \frac{f(x_2) - f(x_1)}{x_2 - x_1} \tag{17.4}$$

which can be rearranged to yield

$$f_1(x) = f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1}(x - x_1) \tag{17.5}$$

which is the Newton linear-interpolation formula. The notation $f_1(x)$ designates that this is
a first-order interpolating polynomial. Notice that besides representing the slope of the line
connecting the points, the term $[f(x_2) - f(x_1)]/(x_2 - x_1)$ is a finite-difference approximation
of the first derivative [recall Eq. (4.20)]. In general, the smaller the interval between the
data points, the better the approximation. This is due to the fact that, as the interval
decreases, a continuous function will be better approximated by a straight line. This
characteristic is demonstrated in the following example.

FIGURE 17.2
Graphical depiction of linear interpolation. The shaded areas indicate the similar triangles
used to derive the Newton linear-interpolation formula [Eq. (17.5)].

EXAMPLE 17.2 Linear Interpolation 


Problem Statement. Estimate the natural logarithm of 2 using linear interpolation. First,
perform the computation by interpolating between ln 1 = 0 and ln 6 = 1.791759. Then,
repeat the procedure, but use a smaller interval from ln 1 to ln 4 (1.386294). Note that the
true value of ln 2 is 0.6931472.

Solution. We use Eq. (17.5) from $x_1 = 1$ to $x_2 = 6$ to give

$$f_1(2) = 0 + \frac{1.791759 - 0}{6 - 1}(2 - 1) = 0.3583519$$

which represents an error of $\varepsilon_t = 48.3\%$. Using the smaller interval from $x_1 = 1$ to $x_2 = 4$ yields

$$f_1(2) = 0 + \frac{1.386294 - 0}{4 - 1}(2 - 1) = 0.4620981$$

Thus, using the shorter interval reduces the percent relative error to $\varepsilon_t = 33.3\%$. Both
interpolations are shown in Fig. 17.3, along with the true function.

FIGURE 17.3
Two linear interpolations to estimate ln 2. Note how the smaller interval provides a better
estimate.
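
The two estimates of ln 2 can be reproduced with a minimal sketch of Eq. (17.5). The anonymous function name lin and the variable names are assumptions introduced here; they are not part of the text's M-files.

```matlab
% Newton linear-interpolation formula, Eq. (17.5), as an anonymous function
lin = @(x1, f1, x2, f2, x) f1 + (f2 - f1)/(x2 - x1)*(x - x1);

truth = log(2);                          % true value, 0.6931472

est6 = lin(1, log(1), 6, log(6), 2);     % interval from x = 1 to 6
est4 = lin(1, log(1), 4, log(4), 2);     % interval from x = 1 to 4

% percent relative errors
err6 = abs((truth - est6)/truth)*100;
err4 = abs((truth - est4)/truth)*100;

fprintf('1 to 6: estimate = %.7f, error = %.1f%%\n', est6, err6)
fprintf('1 to 4: estimate = %.7f, error = %.1f%%\n', est4, err4)
```

Shrinking the upper base point further (for example to 3) is an easy way to watch the error continue to fall as the interval narrows.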


17.2.2 Quadratic Interpolation 


The error in Example 17.2 resulted from approximating a curve with a straight line. Con- 
sequently, a strategy for improving the estimate is to introduce some curvature into the line 
connecting the points. If three data points are available, this can be accomplished with a 
second-order polynomial (also called a quadratic polynomial or a parabola). A particularly 
convenient form for this purpose is 


$$f_2(x) = b_1 + b_2(x - x_1) + b_3(x - x_1)(x - x_2) \tag{17.6}$$

A simple procedure can be used to determine the values of the coefficients. For $b_1$,
Eq. (17.6) with $x = x_1$ can be used to compute

$$b_1 = f(x_1) \tag{17.7}$$

Equation (17.7) can be substituted into Eq. (17.6), which can be evaluated at $x = x_2$ for

$$b_2 = \frac{f(x_2) - f(x_1)}{x_2 - x_1} \tag{17.8}$$

Finally, Eqs. (17.7) and (17.8) can be substituted into Eq. (17.6), which can be evaluated at
$x = x_3$ and solved (after some algebraic manipulations) for

$$b_3 = \frac{\dfrac{f(x_3) - f(x_2)}{x_3 - x_2} - \dfrac{f(x_2) - f(x_1)}{x_2 - x_1}}{x_3 - x_1} \tag{17.9}$$

Notice that, as was the case with linear interpolation, $b_2$ still represents the slope of the
line connecting points $x_1$ and $x_2$. Thus, the first two terms of Eq. (17.6) are equivalent to
linear interpolation between $x_1$ and $x_2$, as specified previously in Eq. (17.5). The last term,
$b_3(x - x_1)(x - x_2)$, introduces the second-order curvature into the formula.

Before illustrating how to use Eq. (17.6), we should examine the form of the coefficient
$b_3$. It is very similar to the finite-difference approximation of the second derivative
introduced previously in Eq. (4.27). Thus, Eq. (17.6) is beginning to manifest a structure
that is very similar to the Taylor series expansion. That is, terms are added sequentially to
capture increasingly higher-order curvature.


EXAMPLE 17.3 Quadratic Interpolation


Problem Statement. Employ a second-order Newton polynomial to estimate ln 2 with the
same three points used in Example 17.2:

$$\begin{aligned} x_1 &= 1 & f(x_1) &= 0 \\ x_2 &= 4 & f(x_2) &= 1.386294 \\ x_3 &= 6 & f(x_3) &= 1.791759 \end{aligned}$$

Solution. Applying Eq. (17.7) yields

$$b_1 = 0$$

Equation (17.8) gives

$$b_2 = \frac{1.386294 - 0}{4 - 1} = 0.4620981$$

and Eq. (17.9) yields

$$b_3 = \frac{\dfrac{1.791759 - 1.386294}{6 - 4} - 0.4620981}{6 - 1} = -0.0518731$$

Substituting these values into Eq. (17.6) yields the quadratic formula

$$f_2(x) = 0 + 0.4620981(x - 1) - 0.0518731(x - 1)(x - 4)$$

which can be evaluated at x = 2 for $f_2(2) = 0.5658444$, which represents a relative error of
$\varepsilon_t = 18.4\%$. Thus, the curvature introduced by the quadratic formula (Fig. 17.4) improves
the interpolation compared with the result obtained using straight lines in Example 17.2
and Fig. 17.3.

FIGURE 17.4
The use of quadratic interpolation to estimate ln 2. The linear interpolation from x = 1 to 4 is
also included for comparison.
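
The hand calculation for the quadratic estimate of ln 2 can be mirrored step by step in MATLAB. This is only a sketch of Eqs. (17.6) through (17.9) for the three base points x = 1, 4, and 6; the variable names are chosen here for illustration.

```matlab
% Base points and function values for ln x
x = [1 4 6];
f = log(x);

% Coefficients from Eqs. (17.7) through (17.9)
b1 = f(1);
b2 = (f(2) - f(1))/(x(2) - x(1));
b3 = ((f(3) - f(2))/(x(3) - x(2)) - b2)/(x(3) - x(1));

% Second-order Newton polynomial, Eq. (17.6)
f2 = @(xx) b1 + b2*(xx - x(1)) + b3*(xx - x(1)).*(xx - x(2));

est = f2(2);                                % estimate of ln 2
err = abs((log(2) - est)/log(2))*100;       % percent relative error

fprintf('b2 = %.7f, b3 = %.7f\n', b2, b3)
fprintf('f2(2) = %.7f, error = %.1f%%\n', est, err)
```

Because the last term is simply added to the linear estimate, the first two coefficients do not need to be recomputed when the third point is introduced.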


17.2.3 General Form of Newton’s Interpolating Polynomials 


The preceding analysis can be generalized to fit an (n — 1)th-order polynomial to n data 
points. The (n — 1)th-order polynomial is 


$$f_{n-1}(x) = b_1 + b_2(x - x_1) + \cdots + b_n(x - x_1)(x - x_2)\cdots(x - x_{n-1}) \tag{17.10}$$

As was done previously with linear and quadratic interpolation, data points can be used to
evaluate the coefficients $b_1, b_2, \ldots, b_n$. For an (n - 1)th-order polynomial, n data points are
required: $[x_1, f(x_1)], [x_2, f(x_2)], \ldots, [x_n, f(x_n)]$. We use these data points and the following
equations to evaluate the coefficients:

$$b_1 = f(x_1) \tag{17.11}$$

$$b_2 = f[x_2, x_1] \tag{17.12}$$

$$b_3 = f[x_3, x_2, x_1] \tag{17.13}$$

$$b_n = f[x_n, x_{n-1}, \ldots, x_2, x_1] \tag{17.14}$$

where the bracketed function evaluations are finite divided differences. For example, the
first finite divided difference is represented generally as

$$f[x_i, x_j] = \frac{f(x_i) - f(x_j)}{x_i - x_j} \tag{17.15}$$

The second finite divided difference, which represents the difference of two first divided
differences, is expressed generally as

$$f[x_i, x_j, x_k] = \frac{f[x_i, x_j] - f[x_j, x_k]}{x_i - x_k} \tag{17.16}$$

Similarly, the nth finite divided difference is

$$f[x_n, x_{n-1}, \ldots, x_2, x_1] = \frac{f[x_n, x_{n-1}, \ldots, x_2] - f[x_{n-1}, x_{n-2}, \ldots, x_1]}{x_n - x_1} \tag{17.17}$$


x_i    f(x_i)    First          Second              Third
x_1    f(x_1)    f[x_2, x_1]    f[x_3, x_2, x_1]    f[x_4, x_3, x_2, x_1]
x_2    f(x_2)    f[x_3, x_2]    f[x_4, x_3, x_2]
x_3    f(x_3)    f[x_4, x_3]
x_4    f(x_4)

FIGURE 17.5
Graphical depiction of the recursive nature of finite divided differences. This representation
is referred to as a divided difference table.


These differences can be used to evaluate the coefficients in Eqs. (17.11) through (17.14), 
which can then be substituted into Eq. (17.10) to yield the general form of Newton’s inter- 
polating polynomial: 


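$$\begin{aligned} f_{n-1}(x) = {} & f(x_1) + (x - x_1) f[x_2, x_1] + (x - x_1)(x - x_2) f[x_3, x_2, x_1] \\ & + \cdots + (x - x_1)(x - x_2)\cdots(x - x_{n-1}) f[x_n, x_{n-1}, \ldots, x_2, x_1] \end{aligned} \tag{17.18}$$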


We should note that it is not necessary that the data points used in Eq. (17.18) be 
equally spaced or that the abscissa values necessarily be in ascending order, as illustrated 
in the following example. However, the points should be ordered so that they are centered 
around and as close as possible to the unknown. Also, notice how Eqs. (17.15) through 
(17.17) are recursive—that is, higher-order differences are computed by taking differences 
of lower-order differences (Fig. 17.5). This property will be exploited when we develop an 
efficient M-file to implement the method. 
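
The remark that the abscissas need not be in ascending order can be checked with a small experiment. The sketch below builds the divided difference table recursively for the same four values of ln x supplied in two different orders and evaluates Eq. (17.18) at x = 2. The choice of points and the cell array named orders are assumptions made for illustration.

```matlab
% Two orderings of the same four base points (illustrative choice)
orders = {[1 4 6 5], [1 4 5 6]};
xx  = 2;                     % point at which to interpolate
est = zeros(1, 2);

for k = 1:2
    x = orders{k};
    n = numel(x);
    b = zeros(n);
    b(:,1) = log(x(:));      % first column holds the function values
    % each new column is built from differences of the previous column
    for j = 2:n
        for i = 1:n-j+1
            b(i,j) = (b(i+1,j-1) - b(i,j-1))/(x(i+j-1) - x(i));
        end
    end
    % accumulate the terms of Eq. (17.18) using the first row of the table
    xt = 1;
    yint = b(1,1);
    for j = 1:n-1
        xt = xt*(xx - x(j));
        yint = yint + b(1,j+1)*xt;
    end
    est(k) = yint;
end

fprintf('order 1: %.10f\norder 2: %.10f\n', est(1), est(2))
```

The individual coefficients differ between the two orderings, but the final estimate agrees to within roundoff because both describe the same unique cubic.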


EXAMPLE 17.4 Newton Interpolating Polynomial


Problem Statement. In Example 17.3, data points at $x_1 = 1$, $x_2 = 4$, and $x_3 = 6$ were used
to estimate ln 2 with a parabola. Now, adding a fourth point [$x_4 = 5$; $f(x_4) = 1.609438$],
estimate ln 2 with a third-order Newton’s interpolating polynomial.

Solution. The third-order polynomial, Eq. (17.10) with n = 4, is

$$f_3(x) = b_1 + b_2(x - x_1) + b_3(x - x_1)(x - x_2) + b_4(x - x_1)(x - x_2)(x - x_3)$$

The first divided differences for the problem are [Eq. (17.15)]

$$f[x_2, x_1] = \frac{1.386294 - 0}{4 - 1} = 0.4620981$$

$$f[x_3, x_2] = \frac{1.791759 - 1.386294}{6 - 4} = 0.2027326$$

$$f[x_4, x_3] = \frac{1.609438 - 1.791759}{5 - 6} = 0.1823216$$

The second divided differences are [Eq. (17.16)]

$$f[x_3, x_2, x_1] = \frac{0.2027326 - 0.4620981}{6 - 1} = -0.05187311$$

$$f[x_4, x_3, x_2] = \frac{0.1823216 - 0.2027326}{5 - 4} = -0.02041100$$

The third divided difference is [Eq. (17.17) with n = 4]

$$f[x_4, x_3, x_2, x_1] = \frac{-0.02041100 - (-0.05187311)}{5 - 1} = 0.007865529$$

Thus, the divided difference table is

x_i    f(x_i)      First        Second         Third
1      0           0.4620981    -0.05187311    0.007865529
4      1.386294    0.2027326    -0.02041100
6      1.791759    0.1823216
5      1.609438

The results for $f(x_1)$, $f[x_2, x_1]$, $f[x_3, x_2, x_1]$, and $f[x_4, x_3, x_2, x_1]$ represent the coefficients
$b_1$, $b_2$, $b_3$, and $b_4$, respectively, of Eq. (17.10). Thus, the interpolating cubic is

$$f_3(x) = 0 + 0.4620981(x - 1) - 0.05187311(x - 1)(x - 4) + 0.007865529(x - 1)(x - 4)(x - 6)$$

which can be used to evaluate $f_3(2) = 0.6287686$, which represents a relative error of
$\varepsilon_t = 9.3\%$. The complete cubic polynomial is shown in Fig. 17.6.

FIGURE 17.6
The use of cubic interpolation to estimate ln 2.
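
Because only one cubic passes through four points, the Newton-form estimate of ln 2 can be cross-checked against MATLAB's built-in functions. The following sketch is offered only as a sanity check and is not part of the Newton method itself; it fits the same four values of ln x with polyfit and evaluates the result at x = 2.

```matlab
% Same four base points as the cubic Newton polynomial for ln x
x = [1 4 6 5];
y = log(x);

% 4 points and 4 coefficients, so polyfit interpolates
p = polyfit(x, y, 3);
est = polyval(p, 2);

fprintf('cubic estimate of ln 2 via polyfit/polyval: %.7f\n', est)
```

Agreement with the hand-computed value is expected because the Newton form and the standard form of Eq. (17.2) are two ways of writing the same polynomial.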


17.2.4 MATLAB M-file: Newtint 


It is straightforward to develop an M-file to implement Newton interpolation. As in Fig. 17.7, 
the first step is to compute the finite divided differences and store them in an array. The dif- 
ferences are then used in conjunction with Eq. (17.18) to perform the interpolation. 

FIGURE 17.7
An M-file to implement Newton interpolation.

function yint = Newtint(x,y,xx)
% Newtint: Newton interpolating polynomial
% yint = Newtint(x,y,xx): Uses an (n - 1)-order Newton
%   interpolating polynomial based on n data points (x, y)
%   to determine a value of the dependent variable (yint)
%   at a given value of the independent variable, xx.
% input:
%   x = independent variable
%   y = dependent variable
%   xx = value of independent variable at which
%        interpolation is calculated
% output:
%   yint = interpolated value of dependent variable

% compute the finite divided differences in the form of a
% difference table
n = length(x);
if length(y)~=n, error('x and y must be same length'); end
b = zeros(n,n);
% assign dependent variables to the first column of b.
b(:,1) = y(:); % the (:) ensures that y is a column vector.
for j = 2:n
  for i = 1:n-j+1
    b(i,j) = (b(i+1,j-1)-b(i,j-1))/(x(i+j-1)-x(i));
  end
end
% use the finite divided differences to interpolate
xt = 1;
yint = b(1,1);
for j = 1:n-1
  xt = xt*(xx-x(j));
  yint = yint+b(1,j+1)*xt;
end

An example of a session using the function would be to duplicate the calculation we
just performed in Example 17.4:

>> format long
>> x = [1 4 6 5]';
>> y = log(x);
>> Newtint(x,y,2)

ans =
   0.62876857890841


17.3 LAGRANGE INTERPOLATING POLYNOMIAL


Suppose we formulate a linear interpolating polynomial as the weighted average of the two 
values that we are connecting by a straight line: 


$$f(x) = L_1 f(x_1) + L_2 f(x_2) \tag{17.19}$$

where the L’s are the weighting coefficients. It is logical that the first weighting coefficient
is the straight line that is equal to 1 at $x_1$ and 0 at $x_2$:

$$L_1 = \frac{x - x_2}{x_1 - x_2}$$

Similarly, the second coefficient is the straight line that is equal to 1 at $x_2$ and 0 at $x_1$:

$$L_2 = \frac{x - x_1}{x_2 - x_1}$$

Substituting these coefficients into Eq. (17.19) yields the straight line that connects the
points (Fig. 17.8):

$$f_1(x) = \frac{x - x_2}{x_1 - x_2} f(x_1) + \frac{x - x_1}{x_2 - x_1} f(x_2) \tag{17.20}$$

where the nomenclature $f_1(x)$ designates that this is a first-order polynomial. Equation (17.20)
is referred to as the linear Lagrange interpolating polynomial.

The same strategy can be employed to fit a parabola through three points. For this case 
three parabolas would be used with each one passing through one of the points and equal- 
ing zero at the other two. Their sum would then represent the unique parabola that connects 
the three points. Such a second-order Lagrange interpolating polynomial can be written as 


$$f_2(x) = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} f(x_1) + \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} f(x_2) + \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} f(x_3) \tag{17.21}$$

Notice how the first term is equal to $f(x_1)$ at $x_1$ and is equal to zero at $x_2$ and $x_3$. The other
terms work in a similar fashion.

Both the first- and second-order versions as well as higher-order Lagrange polynomials
can be represented concisely as

$$f_{n-1}(x) = \sum_{i=1}^{n} L_i(x) f(x_i) \tag{17.22}$$

where

$$L_i(x) = \prod_{\substack{j=1 \\ j \ne i}}^{n} \frac{x - x_j}{x_i - x_j} \tag{17.23}$$

Here n = the number of data points and $\prod$ designates the “product of.”

FIGURE 17.8
A visual depiction of the rationale behind Lagrange interpolating polynomials. The figure
shows the first-order case. Each of the two terms of Eq. (17.20) passes through one of the
points and is zero at the other. The summation of the two terms must, therefore, be the
unique straight line that connects the two points.
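
The weighting coefficients of Eq. (17.23) can be computed on their own to see how they behave. In this sketch the base points 0, 20, and 40 and the evaluation point 15 are illustrative assumptions; only the weights are computed, not an interpolated value.

```matlab
% Lagrange weighting coefficients L_i evaluated at a single point
x  = [0 20 40];      % base points (illustrative choice)
xx = 15;             % evaluation point (illustrative choice)
n  = numel(x);

L = ones(1, n);
for i = 1:n
    for j = 1:n
        if i ~= j    % skip the j = i factor, as in Eq. (17.23)
            L(i) = L(i)*(xx - x(j))/(x(i) - x(j));
        end
    end
end

disp(L)              % the individual weights
disp(sum(L))         % the weights sum to 1
```

The weights always sum to 1, which is why the formula can be read as a weighted average of the function values, although individual weights can be negative when more than two points are used.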


EXAMPLE 17.5 Lagrange Interpolating Polynomial


Problem Statement. Use a Lagrange interpolating polynomial of the first and second
order to evaluate the density of unused motor oil at T = 15 °C based on the following data:

$$\begin{aligned} x_1 &= 0 & f(x_1) &= 3.85 \\ x_2 &= 20 & f(x_2) &= 0.800 \\ x_3 &= 40 & f(x_3) &= 0.212 \end{aligned}$$

Solution. The first-order polynomial [Eq. (17.20)] can be used to obtain the estimate at
x = 15:

$$f_1(x) = \frac{15 - 20}{0 - 20}\,3.85 + \frac{15 - 0}{20 - 0}\,0.800 = 1.5625$$

In a similar fashion, the second-order polynomial is developed as [Eq. (17.21)]

$$f_2(x) = \frac{(15 - 20)(15 - 40)}{(0 - 20)(0 - 40)}\,3.85 + \frac{(15 - 0)(15 - 40)}{(20 - 0)(20 - 40)}\,0.800 + \frac{(15 - 0)(15 - 20)}{(40 - 0)(40 - 20)}\,0.212 = 1.3316875$$


17.3.1 MATLAB M-file: Lagrange 


It is straightforward to develop an M-file based on Eqs. (17.22) and (17.23). As in Fig. 17.9, 
the function is passed two vectors containing the independent (x) and the dependent (y) 
variables. It is also passed the value of the independent variable where you want to interpo- 
late (xx). The order of the polynomial is based on the length of the x vector that is passed. 
If n values are passed, an (n — 1)th order polynomial is fit. 


FIGURE 17.9 
An M-file to implement Lagrange interpolation. 


function yint = Lagrange(x,y,xx)
% Lagrange: Lagrange interpolating polynomial
% yint = Lagrange(x,y,xx): Uses an (n - 1)-order
%   Lagrange interpolating polynomial based on n data points
%   to determine a value of the dependent variable (yint) at
%   a given value of the independent variable, xx.
% input:
%   x = independent variable
%   y = dependent variable
%   xx = value of independent variable at which the
%        interpolation is calculated
% output:
%   yint = interpolated value of dependent variable

n = length(x);
if length(y)~=n, error('x and y must be same length'); end
s = 0;
for i = 1:n
  product = y(i);
  for j = 1:n
    if i ~= j
      product = product*(xx-x(j))/(x(i)-x(j));
    end
  end
  s = s+product;
end
yint = s;


An example of a session using the function would be to predict the density of air at 
1 atm pressure at a temperature of 15 °C based on the first four values from Table 17.1. 
Because four values are passed to the function, a third-order polynomial would be imple- 
mented by the Lagrange function to give: 


>> format long 
>> T = [-40 0 20 50]; 
>> d = [1.52 1.29 1.2 1.09]; 
>> density = Lagrange(T,d,15) 
density = 

1.22112847222222 


17.4 INVERSE INTERPOLATION


As the nomenclature implies, the f (x) and x values in most interpolation contexts are the 
dependent and independent variables, respectively. As a consequence, the values of the x’s 
are typically uniformly spaced. A simple example is a table of values derived for the function 


$f(x) = 1/x$:


x       1    2      3         4       5      6         7
f(x)    1    0.5    0.3333    0.25    0.2    0.1667    0.1429


Now suppose that you must use the same data, but you are given a value for f (x) and 
must determine the corresponding value of x. For instance, for the data above, suppose that 
you were asked to determine the value of x that corresponded to f (x) = 0.3. For this case, 
because the function is available and easy to manipulate, the correct answer can be deter- 
mined directly as x = 1/0.3 = 3.3333. 

Such a problem is called inverse interpolation. For a more complicated case, you 
might be tempted to switch the f(x) and x values [i.e., merely plot x versus f(x)] and 
use an approach like Newton or Lagrange interpolation to determine the result. Unfor- 
tunately, when you reverse the variables, there is no guarantee that the values along the 
new abscissa [the f (x)’s] will be evenly spaced. In fact, in many cases, the values will 
be “telescoped.” That is, they will have the appearance of a logarithmic scale with some 
adjacent points bunched together and others spread out widely. For example, for $f(x) = 1/x$
the result is


f(x)    0.1429    0.1667    0.2    0.25    0.3333    0.5
x       7         6         5      4       3         2


Such nonuniform spacing on the abscissa often leads to oscillations in the resulting
interpolating polynomial. This can occur even for lower-order polynomials. An alternative strategy
is to fit an nth-order interpolating polynomial, $f_n(x)$, to the original data [i.e., with f(x) versus
x]. In most cases, because the x’s are evenly spaced, this polynomial will not be ill-conditioned.
The answer to your problem then amounts to finding the value of x that makes this polynomial
equal to the given f(x). Thus, the interpolation problem reduces to a roots problem!


For example, for the problem just outlined, a simple approach would be to fit a qua- 
dratic polynomial to the three points: (2, 0.5), (3, 0.3333), and (4, 0.25). The result would be 


$$f_2(x) = 0.041667x^2 - 0.375x + 1.08333$$

The answer to the inverse interpolation problem of finding the x corresponding to f(x) = 0.3
would therefore involve determining the root of

$$0.3 = 0.041667x^2 - 0.375x + 1.08333$$

For this simple case, the quadratic formula can be used to calculate

$$x = \frac{0.375 \pm \sqrt{(-0.375)^2 - 4(0.041667)0.78333}}{2(0.041667)} = \begin{matrix} 5.704158 \\ 3.295842 \end{matrix}$$


Thus, the second root, 3.296, is a good approximation of the true value of 3.333. If ad- 
ditional accuracy were desired, a third- or fourth-order polynomial along with one of the 
root-location methods from Chaps. 5 or 6 could be employed. 
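
The roots-problem strategy can be sketched with built-in functions. The version below is one possible arrangement, not the only one: it fits the quadratic with polyfit, shifts the constant term by the target value, and applies roots. It uses the exact values of 1/x at the three base points rather than the rounded table entries, so the roots match the hand calculation only to about five significant figures.

```matlab
% Inverse interpolation recast as a roots problem
x = [2 3 4];
y = 1./x;                  % exact values of f(x) = 1/x at the base points
target = 0.3;              % value of f(x) whose x is sought

p = polyfit(x, y, 2);      % interpolating parabola f2(x)
p(end) = p(end) - target;  % coefficients of f2(x) - 0.3

r = sort(roots(p));        % both roots of the shifted parabola

fprintf('roots: %.6f and %.6f\n', r(1), r(2))
fprintf('true value: %.6f\n', 1/target)
```

Only the root that falls inside the range of the base points is meaningful; the other is an artifact of the parabola and should be discarded.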


17.5 EXTRAPOLATION AND OSCILLATIONS


Before leaving this chapter, there are two issues related to polynomial interpolation that 
must be addressed. These are extrapolation and oscillations. 


17.5.1 Extrapolation 


Extrapolation is the process of estimating a value of f(x) that lies outside the range of the
known base points, $x_1, x_2, \ldots, x_n$. As depicted in Fig. 17.10, the open-ended nature of
extrapolation represents a step into the unknown because the process extends the curve
beyond the known region. As such, the true curve could easily diverge from the prediction.
Extreme care should, therefore, be exercised whenever a case arises where one must
extrapolate.


FIGURE 17.10
Illustration of the possible divergence of an extrapolated prediction. The extrapolation is
based on fitting a parabola through the first three known points.


Dangers of Extrapolation 


Problem Statement. This example is patterned after one originally developed by 
Forsythe, Malcolm, and Moler.' The population in millions of the United States from 1920 
to 2000 can be tabulated as 


Date 1920 1930 1940 1950 1960 1970 1980 1990 2000 
Population 106.46 123.08 132.12 152.27 180.67 205.05 227.23 249.46 281.42 


Fit a seventh-order polynomial to the first 8 points (1920 to 1990). Use it to compute the 
population in 2000 by extrapolation and compare your prediction with the actual result. 
Solution. First, the data can be entered as
>> t = [1920:10:1990];
>> pop = [106.46 123.08 132.12 152.27 180.67 205.05 227.23 249.46];
The polyfit function can be used to compute the coefficients 
>> p = polyfit(t, pop, 7) 
However, when this is implemented, the following message is displayed: 
Warning: Polynomial is badly conditioned. Remove repeated data points or try centering 
and scaling as described in HELP POLYFIT. 
We can follow MATLAB’s suggestion by scaling and centering the data values as in 
>> ts = (t - 1955)/35; 
Now polyfit works without an error message: 
>> p = polyfit(ts, pop, 7); 
We can then use the polynomial coefficients along with the polyval function to predict the
population in 2000 as
>> polyval(p, (2000-1955)/35)
ans =
175.0800
which is much lower than the true value of 281.42. Insight into the problem can be gained
by generating a plot of the data and the polynomial,


>> tt = linspace(1920, 2000);
>> pp = polyval(p,(tt-1955)/35); 
>> plot(t,pop,'o', tt, pp) 


¹ Cleve Moler is one of the founders of The MathWorks, Inc., the makers of MATLAB.
FIGURE 17.11
Use of a seventh-order polynomial to make a prediction of U.S. population in 2000 based on 
data from 1920 through 1990. 


As in Fig. 17.11, the result indicates that the polynomial seems to fit the data nicely 
from 1920 to 1990. However, once we move beyond the range of the data into the realm of 
extrapolation, the seventh-order polynomial plunges to the erroneous prediction in 2000. 
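
The centering and scaling can also be delegated to polyfit itself. The sketch below assumes the three-output form of polyfit, whose third output mu holds the mean and standard deviation of t. That is a different scaling from the (t - 1955)/35 used above, but because the seventh-order polynomial through eight points is unique, the extrapolated value should agree to within roundoff. The name pred2000 is illustrative.

```matlab
t   = 1920:10:1990;
pop = [106.46 123.08 132.12 152.27 180.67 205.05 227.23 249.46];
% Third output mu = [mean(t); std(t)]; the fit is done in (t - mu(1))/mu(2)
[p, ~, mu] = polyfit(t, pop, 7);
% Supplying mu makes polyval apply the same centering and scaling
pred2000 = polyval(p, 2000, [], mu)
```

Passing mu back to polyval avoids having to repeat the transformation by hand every time the polynomial is evaluated.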


17.5.2 Oscillations


Although “more is better” in many contexts, it is absolutely not true for polynomial
interpolation. Higher-order polynomials tend to be very ill-conditioned; that is, they tend
to be highly sensitive to roundoff error. The following example illustrates this point nicely.


EXAMPLE 17.7 Dangers of Higher-Order Polynomial Interpolation


Problem Statement. In 1901, Carl Runge published a study on the dangers of higher- 
order polynomial interpolation. He looked at the following simple-looking function: 


$$f(x) = \frac{1}{1 + 25x^2} \tag{17.24}$$

which is now called Runge’s function. He took equidistantly spaced data points from this 
function over the interval [-1, 1]. He then used interpolating polynomials of increasing 
order and found that as he took more points, the polynomials and the original curve dif- 
fered considerably. Further, the situation deteriorated greatly as the order was increased. 
Duplicate Runge’s result by using the polyfit and polyval functions to fit fourth- and tenth- 
order polynomials to 5 and 11 equally spaced points generated with Eq. (17.24). Create 
plots of your results along with the sampled values and the complete Runge’s function.
Solution. The five equally spaced data points can be generated as in


>> x = linspace(-1,1,5);
>> y = 1./(1+25*x.^2);


Next, a more finely spaced vector of xx values can be computed so that we can create a
smooth plot of the results:

>> xx = linspace(-1,1);

Recall that linspace automatically creates 100 points if the desired number of points is not
specified. The polyfit function can be used to generate the coefficients of the fourth-order
polynomial, and the polyval function can be used to generate the polynomial interpolation
at the finely spaced values of xx:

>> p = polyfit(x,y,4); 

>> y4 = polyval(p,xx); 

Finally, we can generate values for Runge’s function itself and plot them along with the
polynomial fit and the sampled data:

>> yr = 1./(1+25*xx.^2);
>> plot(x,y,'o',xx,y4,xx,yr,'--')

As in Fig. 17.12, the polynomial does a poor job of following Runge’s function. 
Continuing with the analysis, the tenth-order polynomial can be generated and plotted
with


>> x = linspace(-1,1,11);
>> y = 1./(1+25*x.^2);
>> p = polyfit(x,y,10);
>> y10 = polyval(p,xx);
>> plot(x,y,'o',xx,y10,xx,yr,'--')


FIGURE 17.12
Comparison of Runge’s function (dashed line) with a fourth-order polynomial fit to 5 points
sampled from the function.


FIGURE 17.13
Comparison of Runge’s function (dashed line) with a tenth-order polynomial fit to 11 points
sampled from the function.


As in Fig. 17.13, the fit has gotten even worse, particularly at the ends of the interval! 

Although there may be certain contexts where higher-order polynomials are neces- 
sary, they are usually to be avoided. In most engineering and scientific contexts, lower- 
order polynomials of the type described in this chapter can be used effectively to capture 
the curving trends of data without suffering from oscillations. 
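
The visual impression from the two plots can be backed up with a number. The following sketch, which is not part of the original example, measures the largest absolute deviation between each interpolating polynomial and Runge's function on the 100-point plotting grid. The names runge, orders, and maxerr are illustrative.

```matlab
runge = @(x) 1./(1 + 25*x.^2);     % Eq. (17.24)
xx = linspace(-1, 1);              % 100 evaluation points
yr = runge(xx);
orders = [4 10];
maxerr = zeros(size(orders));
for k = 1:numel(orders)
  n = orders(k);
  x = linspace(-1, 1, n + 1);      % n + 1 equally spaced samples
  p = polyfit(x, runge(x), n);     % nth-order interpolating polynomial
  maxerr(k) = max(abs(polyval(p, xx) - yr));
end
maxerr                             % worst-case deviation for each order
```

The second entry of maxerr should come out larger than the first, which is the numerical counterpart of the wild swings near the ends of the interval in Fig. 17.13.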


PROBLEMS 


17.1 The following data come from a table that was mea- 
sured with high precision. Use the best numerical method 
(for this type of problem) to determine y at x = 3.5. Note that 
a polynomial will yield an exact value. Your solution should 
prove that your result is exact. 


x    0     1.8      5       6     8.2     9.2    12
y    26    16.415   5.375   3.5   2.015   2.54   8


17.2 Use Newton’s interpolating polynomial to determine y at 
x = 3.5 to the best possible accuracy. Compute the finite divided
differences as in Fig. 17.5, and order your points to attain optimal
accuracy and convergence. That is, the points should be
centered around and as close as possible to the unknown.


x    0    1        2.5      3        4.5      5        6
y    2    5.4375   7.3516   7.5625   8.4453   9.1875   12

17.3 Use Newton’s interpolating polynomial to determine y at
x = 8 to the best possible accuracy. Compute the finite divided 
differences as in Fig. 17.5, and order your points to attain op- 
timal accuracy and convergence. That is, the points should be 
centered around and as close as possible to the unknown. 


x    0     1       2     5.5   11     13     16    18
y    0.5   3.134   5.3   9.9   10.2   9.35   7.2   6.2

17.4 Given the data 

x      1   2   2.5   3     4   5
f(x)   0   5   7     6.5   2   0


(a) Calculate f (3.4) using Newton’s interpolating polynomi- 
als of order 1 through 3. Choose the sequence of the points 
for your estimates to attain the best possible accuracy. 
That is, the points should be centered around and as close 
as possible to the unknown. 

(b) Repeat (a) but use the Lagrange polynomial. 

17.5 Given the data


x      1      2   3      5       6
f(x)   4.75   4   5.25   19.75   36


Calculate f(4) using Newton’s interpolating polynomials of order 1 through 4. Choose your
base points to attain good accuracy. That is, the points should be centered around and as
close as possible to the unknown. What do your results indicate regarding the order of the
polynomial used to generate the data in the table?

17.6 Repeat Prob. 17.5 using the Lagrange polynomial of order 1 through 3.

17.7 Table P15.5 lists values for dissolved oxygen concen- 

tration in water as a function of temperature and chloride 

concentration. 

(a) Use quadratic and cubic interpolation to determine the 
oxygen concentration for T = 12 °C and c = 10 g/L. 

(b) Use linear interpolation to determine the oxygen con- 
centration for T = 12 °C and c = 15 g/L. 

(c) Repeat (b) but use quadratic interpolation. 

17.8 Employ inverse interpolation using a cubic interpolating polynomial and bisection to
determine the value of x that corresponds to f(x) = 1.7 for the following tabulated data:


x      1     2     3     4     5      6     7
f(x)   3.6   1.8   1.2   0.9   0.72   1.5   0.51429


17.9 Employ inverse interpolation to determine the value of x 
that corresponds to f (x) = 0.93 for the following tabulated data: 


x      0   1     2     3     4          5
f(x)   0   0.5   0.8   0.9   0.941176   0.961538


Note that the values in the table were generated with the function f(x) = x^2/(1 + x^2).

(a) Determine the correct value analytically. 

(b) Use quadratic interpolation and the quadratic formula to 
determine the value numerically. 

(c) Use cubic interpolation and bisection to determine the 
value numerically. 

17.10 Use the portion of the given steam table for superheated water at 200 MPa to find
(a) the corresponding entropy s for a specific volume v of 0.118 with linear interpolation,
(b) the same corresponding entropy using quadratic interpolation, and (c) the volume
corresponding to an entropy of 6.45 using inverse interpolation.


v, m³/kg       0.10377   0.11144   0.12547
s, kJ/(kg K)   6.4147    6.5453    6.7664


17.11 The following data for the density of nitrogen gas 
versus temperature come from a table that was measured 
with high precision. Use first- through fifth-order polynomi- 
als to estimate the density at a temperature of 330 K. What 
is your best estimate? Employ this best estimate and inverse 
interpolation to determine the corresponding temperature. 


T, K              200     250     300     350     400     450
Density, kg/m³    1.708   1.367   1.139   0.967   0.854   0.759


17.12 Ohm’s law states that the voltage drop V across an 
ideal resistor is linearly proportional to the current i flowing 
through the resistor as in V = iR, where R is the resistance.
However, real resistors may not always obey Ohm’s law. 
Suppose that you performed some very precise experiments 
to measure the voltage drop and corresponding current for a 
resistor. The following results suggest a curvilinear relation- 
ship rather than the straight line represented by Ohm’s law: 


i   -2     -1      -0.5    0.5    1      2
V   -637   -96.5   -20.5   20.5   96.5   637


To quantify this relationship, a curve must be fit to the data. 
Because of measurement error, regression would typically be 
the preferred method of curve fitting for analyzing such ex- 
perimental data. However, the smoothness of the relationship, 
as well as the precision of the experimental methods, suggests 
that interpolation might be appropriate. Use a fifth-order interpolating
polynomial to fit the data and compute V for i = 0.10.
17.13 Bessel functions often arise in advanced engineering
analyses such as the study of electric fields. Here are some 
selected values for the zero-order Bessel function of the first 


Estimate J, (2.1) using third- and fourth-order interpolating 
polynomials. Determine the percent relative error for each 
case based on the true value, which can be determined with 
MATLAB’s built-in function besselj.

17.14 Repeat Example 17.6 but using first-, second-, 
third-, and fourth-order interpolating polynomials to pre- 
dict the population in 2000 based on the most recent data. 
That is, for the linear prediction use the data from 1980 
and 1990, for the quadratic prediction use the data from 
1970, 1980, and 1990, and so on. Which approach yields 
the best result? 

17.15 The specific volume of a superheated steam is listed 
in steam tables for various temperatures. 


T, °C      370      382      394      406     418
v, L/kg    5.9313   7.5838   8.8428   9.796   10.5311


Determine v at T = 400 °C. 

17.16 The vertical stress $\sigma_z$ under the corner of a rectangular
area subjected to a uniform load of intensity q is given by the
solution of Boussinesq’s equation:


$$\sigma_z = \frac{q}{4\pi}\left[\frac{2mn\sqrt{m^2+n^2+1}}{m^2+n^2+1+m^2n^2}\,\frac{m^2+n^2+2}{m^2+n^2+1} + \sin^{-1}\left(\frac{2mn\sqrt{m^2+n^2+1}}{m^2+n^2+1+m^2n^2}\right)\right]$$


Because this equation is inconvenient to solve manually, it
has been reformulated as


$$\sigma_z = q\,f(m, n)$$


where f(m, n) is called the influence value, and m and n are
dimensionless ratios, with m = a/z and n = b/z and a and
b are defined in Fig. P17.16. The influence value is then
tabulated, a portion of which is given in Table P17.16. If
a = 4.6 and b = 14, use a third-order interpolating polynomial
to compute $\sigma_z$ at a depth 10 m below the corner
of a rectangular footing that is subject to a total load


FIGURE P17.16


of 100 t (metric tons). Express your answer in tonnes per 
square meter. Note that q is equal to the load per area. 


TABLE P17.16 


m n = 1.2 n= 1.4 n = 1.6 

0.1 0.02926 0.03007 0.03058 
0.2 0.05733 0.05894 0.05994 
0.3 0.08323 0.08561 0.08709 
0.4 0.10631 0.10941 0.11135 
0.5 0.12626 0.13003 0.13241 
0.6 0.14309 0.14749 0.15027 
0.7 0.15703 0.16199 0.16515 
0.8 0.16843 0.17389 0.17739 


17.17 You measure the voltage drop V across a resistor for a 
number of different values of current i. The results are 


1.5 ames, 3.0 


oA 
oo 


Use first- through fourth-order polynomial interpolation to 
estimate the voltage drop for i = 2.3. Interpret your results. 
17.18 The current in a wire is measured with great precision 
as a function of time: 


t   0   0.250   0.500   0.750   1.000
i   0   6.24    7.75    4.85    0.0000


Determine i at t = 0.23. 
17.19 The acceleration due to gravity at an altitude y above 
the surface of the earth is given by 


y, m        0        30,000   60,000   90,000   120,000
g, m/s²     9.8100   9.7487   9.6879   9.6278   9.5682


Compute g at y = 55,000 m.
17.20 Temperatures are measured at various points on a
heated plate (Table P17.20). Estimate the temperature at 
(a) x = 4, y = 3.2 and (b) x = 4.3, y = 2.7. 


TABLE P17.20 Temperatures (°C) at various points 
on a square heated plate. 


x=0 x=2 x=4 x=6 x=8 
y=0 100.00 90.00 80.00 70.00 60.00 
y=2 85.00 64.49 53.50 48.15 50.00 
y=4 70.00 48.90 38.43 35.03 40.00 
y=6 55.00 38.78 30.39 27.07 30.00 
y=8 40.00 35.00 30.00 25.00 20.00 


17.21 Use the portion of the given steam table for superheated
H2O at 200 MPa to (a) find the corresponding
entropy s for a specific volume v of 0.108 m³/kg with linear
interpolation, (b) find the same corresponding entropy using
quadratic interpolation, and (c) find the volume corresponding
to an entropy of 6.6 using inverse interpolation.


v (m³/kg)      0.10377   0.11144   0.12540
s (kJ/kg·K)    6.4147    6.5453    6.7664


17.22 Develop an M-file function that uses polyfit and
polyval for polynomial interpolation. Here is the script you
can use to test your function:


clear, clc, clf, format compact
x = [1 2 4 8];
fx = @(x) 10*exp(-0.2*x);
y = fx(x);
yint = polyint(x,y,3)
ytrue = fx(3)
et = abs((ytrue-yint)/ytrue)*100.


17.23 The following data come from a table that was measured
with high precision. Use the Newton interpolating
polynomial to determine y at x = 3.5. Properly order all the 
points and then develop a divided difference table to com- 
pute the derivatives. Note that a polynomial will yield an 
exact value. Your solution should prove that your result is 
exact. 


x   0    1      2.5     3     4.5     5     6
y   26   16.5   5.375   3.5   2.375   3.5   8


17.24 The following data are measured precisely: 


t   2   2.1     2.2      2.7      3    3.4
z   6   7.752   10.256   36.576   66   125.168


(a) Use Newton interpolating polynomials to determine z at 
t = 2.5. Make sure that you order your points to attain the 
most accurate results. What do your results tell you regard- 
ing the order of the polynomial used to generate the data? 

(b) Use a third-order Lagrange interpolating polynomial to 
determine y at t = 2.5. 

17.25 The following data for the density of water versus
temperature come from a table that was measured with
high precision. Use inverse interpolation to determine the
temperature corresponding to a density of 0.999245 g/cm³.
Base your estimate on a third-order interpolating polynomial.
(Even though you’re doing this problem by hand, feel
free to use the MATLAB polyfit function to determine the
polynomial.) Determine the root with the Newton-Raphson
method (by hand) with an initial guess of T = 14 °C. Be careful
regarding roundoff errors.


T, °C             0         4   8         12        16
Density, g/cm³    0.99987   1   0.99988   0.99952   0.99897


Splines and Piecewise
Interpolation


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to splines. Specific objectives 
and topics covered are 


Understanding that splines minimize oscillations by fitting lower-order 
polynomials to data in a piecewise fashion. 
Knowing how to develop code to perform a table lookup. 


Recognizing why cubic polynomials are preferable to quadratic and higher-order 
splines. 

Understanding the conditions that underlie a cubic spline fit. 

Understanding the differences between natural, clamped, and not-a-knot end 
conditions. 

Knowing how to fit a spline to data with MATLAB’s built-in functions. 
Understanding how multidimensional interpolation is implemented with MATLAB. 


18.1 INTRODUCTION TO SPLINES


In Chap. 17, (n - 1)th-order polynomials were used to interpolate between n data points.
For example, for eight points, we can derive a perfect seventh-order polynomial. This
curve would capture all the meanderings (at least up to and including seventh derivatives)
suggested by the points. However, there are cases where these functions can lead to erroneous
results because of roundoff error and oscillations. An alternative approach is to apply
lower-order polynomials in a piecewise fashion to subsets of data points. Such connecting 
polynomials are called spline functions. 

For example, third-order curves employed to connect each pair of data points are called
cubic splines. These functions can be constructed so that the connections between adjacent
cubic equations are visually smooth. On the surface, it would seem that the third-order
approximation of the splines would be inferior to the seventh-order expression. You might
wonder why a spline would ever be preferable.


FIGURE 18.1
A visual representation of a situation where splines are superior to higher-order interpolating
polynomials. The function to be fit undergoes an abrupt increase at x = 0. Parts (a) through
(c) indicate that the abrupt change induces oscillations in interpolating polynomials. In contrast,
because it is limited to straight-line connections, a linear spline (d) provides a much more
acceptable approximation.

Figure 18.1 illustrates a situation where a spline performs better than a higher-order 
polynomial. This is the case where a function is generally smooth but undergoes an abrupt 
change somewhere along the region of interest. The step increase depicted in Fig. 18.1 is 
an extreme example of such a change and serves to illustrate the point. 

Figure 18.1a through c illustrates how higher-order polynomials tend to swing through
wild oscillations in the vicinity of an abrupt change. In contrast, the spline also connects
the points, but because it is limited to lower-order changes, the oscillations are kept to a
minimum. As such, the spline usually provides a superior approximation of the behavior of
functions that have local, abrupt changes.


FIGURE 18.2
The drafting technique of using a spline to draw smooth curves through a series of points.
Notice how, at the end points, the spline straightens out. This is called a “natural” spline.

The concept of the spline originated from the drafting technique of using a thin, flexible 
strip (called a spline) to draw smooth curves through a set of points. The process is depicted 
in Fig. 18.2 for a series of five pins (data points). In this technique, the drafter places paper 
over a wooden board and hammers nails or pins into the paper (and board) at the location of 
the data points. A smooth cubic curve results from interweaving the strip between the pins. 
Hence, the name “cubic spline” has been adopted for polynomials of this type. 

In this chapter, simple linear functions will first be used to introduce some basic con- 
cepts and issues associated with spline interpolation. Then we derive an algorithm for fitting 
quadratic splines to data. This is followed by material on the cubic spline, which is the most 
common and useful version in engineering and science. Finally, we describe MATLAB’s 
capabilities for piecewise interpolation including its ability to generate splines. 


18.2 LINEAR SPLINES


The notation used for splines is displayed in Fig. 18.3. For n data points (i = 1, 2, ..., n),
there are n - 1 intervals. Each interval i has its own spline function, $s_i(x)$. For linear splines,
each function is merely the straight line connecting the two points at each end of the
interval, which is formulated as

$$s_i(x) = a_i + b_i(x - x_i) \tag{18.1}$$

where $a_i$ is the intercept, which is defined as

$$a_i = f_i \tag{18.2}$$

and $b_i$ is the slope of the straight line connecting the points:

$$b_i = \frac{f_{i+1} - f_i}{x_{i+1} - x_i} \tag{18.3}$$

where $f_i$ is shorthand for $f(x_i)$. Substituting Eqs. (18.2) and (18.3) into Eq. (18.1) gives

$$s_i(x) = f_i + \frac{f_{i+1} - f_i}{x_{i+1} - x_i}(x - x_i) \tag{18.4}$$

These equations can be used to evaluate the function at any point between $x_1$ and $x_n$
by first locating the interval within which the point lies. Then the appropriate equation is
used to determine the function value within the interval. Inspection of Eq. (18.4) indicates
that the linear spline amounts to using Newton’s first-order polynomial [Eq. (17.5)] to
interpolate within each interval.


FIGURE 18.3
Notation used to derive splines. Notice that there are n - 1 intervals and n data points.

EXAMPLE 18.1 First-Order Splines 


Problem Statement. Fit the data in Table 18.1 with first-order splines. Evaluate the 
function at x = 5. 


TABLE 18.1 Data to be fit with spline functions.


i    x_i    f_i
1    3.0    2.5
2    4.5    1.0
3    7.0    2.5
4    9.0    0.5


Solution. The data can be substituted into Eq. (18.4) to generate the linear spline functions. 
For example, for the second interval from x = 4.5 to x = 7, the function is 


$$s_2(x) = 1.0 + \frac{2.5 - 1.0}{7.0 - 4.5}(x - 4.5)$$


FIGURE 18.4
Spline fits of a set of four points. (a) Linear spline, (b) quadratic spline, and (c) cubic spline,
with a cubic interpolating polynomial also plotted.


The equations for the other intervals can be computed, and the resulting first-order splines
are plotted in Fig. 18.4a. The value at x = 5 is 1.3.


$$s_2(5) = 1.0 + \frac{2.5 - 1.0}{7.0 - 4.5}(5 - 4.5) = 1.3$$
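
The same evaluation can be sketched in a few lines of MATLAB. The data values are the x's and f's listed for this data set in Example 18.2, and the names xq, s, and s_builtin are illustrative. The first calculation applies Eq. (18.4) directly; the second relies on the built-in interp1 function, which performs linear interpolation by default.

```matlab
x = [3.0 4.5 7.0 9.0];                 % independent variable
f = [2.5 1.0 2.5 0.5];                 % dependent variable
xq = 5;                                % point at which to interpolate
i = find(x <= xq, 1, 'last');          % interval i such that x(i) <= xq
i = min(i, numel(x) - 1);              % keep i valid when xq equals x(end)
s = f(i) + (f(i+1) - f(i))/(x(i+1) - x(i))*(xq - x(i))   % Eq. (18.4)
s_builtin = interp1(x, f, xq)          % linear by default
```

Both calculations should return the value of 1.3 obtained by hand.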


Visual inspection of Fig. 18.4a indicates that the primary disadvantage of first-order 
splines is that they are not smooth. In essence, at the data points where two splines meet 
(called a knot), the slope changes abruptly. In formal terms, the first derivative of the func- 
tion is discontinuous at these points. This deficiency is overcome by using higher-order 
polynomial splines that ensure smoothness at the knots by equating derivatives at these 
points, as will be discussed subsequently. Before doing that, the following section provides 
an application where linear splines are useful.


18.2.1 Table Lookup


A table lookup is a common task that is frequently encountered in engineering and 
science computer applications. It is useful for performing repeated interpolations from 
a table of independent and dependent variables. For example, suppose that you would 
like to set up an M-file that would use linear interpolation to determine air density at a 
particular temperature based on the data from Table 17.1. One way to do this would be 
to pass the M-file the temperature at which you want the interpolation to be performed 
along with the two adjoining values. A more general approach would be to pass in vec- 
tors containing all the data and have the M-file determine the bracket. This is called a 
table lookup. 

Thus, the M-file would perform two tasks. First, it would search the independent vari- 
able vector to find the interval containing the unknown. Then it would perform the linear 
interpolation using one of the techniques described in this chapter or in Chap. 17. 

For ordered data, there are two simple ways to find the interval. The first is called a 
sequential search. As the name implies, this method involves comparing the desired value 
with each element of the vector in sequence until the interval is located. For data in ascend- 
ing order, this can be done by testing whether the unknown is less than the value being as- 
sessed. If so, we know that the unknown falls between this value and the previous one that 
we examined. If not, we move to the next value and repeat the comparison. Here is a simple 
M-file that accomplishes this objective: 


function yi = TableLook(x, y, xx)
% TableLook: table lookup with a sequential search and linear interpolation
n = length(x);
if xx < x(1) || xx > x(n)
  error('Interpolation outside range')
end
% sequential search
i = 1;
while(1)
  if xx <= x(i + 1), break, end
  i = i + 1;
end
% linear interpolation
yi = y(i) + (y(i+1)-y(i))/(x(i+1)-x(i))*(xx-x(i));


The table’s independent variables are stored in ascending order in the array x and the 
dependent variables stored in the array y. Before searching, an error trap is included to 
ensure that the desired value xx falls within the range of the x’s. A while...break loop
compares the value at which the interpolation is desired, xx, to determine whether it is 
less than the value at the top of the interval, x(i+1). For cases where xx is in the second 
interval or higher, this will not test true at first. In this case the counter i is incremented 
by one so that on the next iteration, xx is compared with the value at the top of the sec- 
ond interval. The loop is repeated until the xx is less than or equal to the interval’s upper 
bound, in which case the loop is exited. At this point, the interpolation can be performed 
simply as shown. 

For situations for which there are lots of data, the sequential search is inefficient because
it must search through all the preceding points to find values. In these cases, a simple
alternative is the binary search. Here is an M-file that performs a binary search followed
by linear interpolation:

function yi = TableLookBin(x, y, xx)
n = length(x);
if xx < x(1) || xx > x(n)
  error('Interpolation outside range')
end
% binary search
iL = 1; iU = n;
while (1)
  if iU - iL <= 1, break, end
  iM = fix((iL + iU) / 2);
  if x(iM) < xx
    iL = iM;
  else
    iU = iM;
  end
end
% linear interpolation
yi = y(iL) + (y(iL+1)-y(iL))/(x(iL+1)-x(iL))*(xx - x(iL));

The approach is akin to the bisection method for root location. Just as in bisection,
the index at the midpoint iM is computed as the average of the first or “lower” index iL = 1
and the last or “upper” index iU = n. The unknown xx is then compared with the value of
x at the midpoint x(iM) to assess whether it is in the lower half of the array or in the upper
half. Depending on where it lies, either the lower or upper index is redefined as being
the middle index. The process is repeated until the difference between the upper and the
lower index is less than or equal to one. At this point, the lower index lies at the lower
bound of the interval containing xx, the loop terminates, and the linear interpolation is
performed.

Here is a MATLAB session illustrating how the binary search function can be applied 
to calculate the air density at 350 °C based on the data from Table 17.1. The sequential 
search would be similar.

>> T = [-40 0 20 50 100 150 200 250 300 400 500];
>> density = [1.52 1.29 1.2 1.09 .946 .935 .746 .675 .616 .525 .457];
>> TableLookBin(T, density, 350)


ans = 
0.5705 


This result can be verified by the hand calculation: 


$$f(350) = 0.616 + \frac{0.525 - 0.616}{400 - 300}(350 - 300) = 0.5705$$
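
As a further cross-check that does not depend on either M-file, the built-in interp1 function can perform the same lookup. This is only a sketch: the names rho350 and rhoq and the extra query temperatures of 10 and 75 are illustrative and do not come from the example. A practical advantage of interp1 is that it accepts a vector of query points in a single call.

```matlab
T = [-40 0 20 50 100 150 200 250 300 400 500];
density = [1.52 1.29 1.2 1.09 .946 .935 .746 .675 .616 .525 .457];
rho350 = interp1(T, density, 350)         % single lookup, linear by default
rhoq = interp1(T, density, [10 75 350])   % several lookups in one call
```

The first result should match the 0.5705 returned by TableLookBin and by the hand calculation.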


18.3 QUADRATIC SPLINES


To ensure that the nth derivatives are continuous at the knots, a spline of at least
order n + 1 must be used. Third-order polynomials or cubic splines that ensure continuous first
and second derivatives are most frequently used in practice. Although third and higher
derivatives can be discontinuous when using cubic splines, they usually cannot be detected
visually and consequently are ignored.

Because the derivation of cubic splines is somewhat involved, we have decided to first 
illustrate the concept of spline interpolation using second-order polynomials. These “qua- 
dratic splines” have continuous first derivatives at the knots. Although quadratic splines 
are not of practical importance, they serve nicely to demonstrate the general approach for 
developing higher-order splines. 

The objective in quadratic splines is to derive a second-order polynomial for each in- 
terval between data points. The polynomial for each interval can be represented generally as 


$$s_i(x) = a_i + b_i(x - x_i) + c_i(x - x_i)^2 \tag{18.5}$$


where the notation is as in Fig. 18.3. For n data points (i = 1, 2, ..., n), there are n - 1
intervals and, consequently, 3(n - 1) unknown constants (the a’s, b’s, and c’s) to evaluate.
Therefore, 3(n - 1) equations or conditions are required to evaluate the unknowns. These
can be developed as follows:


1. The function must pass through all the points. This is called a continuity condition. It
can be expressed mathematically as

$$f_i = a_i + b_i(x_i - x_i) + c_i(x_i - x_i)^2$$

which simplifies to

$$a_i = f_i \tag{18.6}$$

Therefore, the constant in each quadratic must be equal to the value of the dependent
variable at the beginning of the interval. This result can be incorporated into Eq. (18.5):

$$s_i(x) = f_i + b_i(x - x_i) + c_i(x - x_i)^2$$

Note that because we have determined one of the coefficients, the number of conditions
to be evaluated has now been reduced to 2(n - 1).

2. The function values of adjacent polynomials must be equal at the knots. This condition
can be written for knot i + 1 as

$$f_i + b_i(x_{i+1} - x_i) + c_i(x_{i+1} - x_i)^2 = f_{i+1} + b_{i+1}(x_{i+1} - x_{i+1}) + c_{i+1}(x_{i+1} - x_{i+1})^2 \tag{18.7}$$

This equation can be simplified mathematically by defining the width of the ith
interval as

$$h_i = x_{i+1} - x_i$$

Thus, Eq. (18.7) simplifies to

$$f_i + b_i h_i + c_i h_i^2 = f_{i+1} \tag{18.8}$$


This equation can be written for the nodes, i = 1, ..., n - 1. Since this amounts
to n - 1 conditions, it means that there are 2(n - 1) - (n - 1) = n - 1 remaining
conditions.

3. The first derivatives at the interior nodes must be equal. This is an important condition,
because it means that adjacent splines will be joined smoothly, rather than in the jagged
fashion that we saw for the linear splines. Equation (18.5) can be differentiated to yield

$$s_i'(x) = b_i + 2c_i(x - x_i)$$

The equivalence of the derivatives at an interior node, i + 1, can therefore be written as

$$b_i + 2c_i h_i = b_{i+1} \tag{18.9}$$

Writing this equation for all the interior nodes amounts to n - 2 conditions. This
means that there is n - 1 - (n - 2) = 1 remaining condition. Unless we have some
additional information regarding the functions or their derivatives, we must make an
arbitrary choice to successfully compute the constants. Although there are a number
of different choices that can be made, we select the following condition.

4. Assume that the second derivative is zero at the first point. Because the second derivative
of Eq. (18.5) is $2c_i$, this condition can be expressed mathematically as

$$c_1 = 0$$

The visual interpretation of this condition is that the first two points will be connected
by a straight line.

EXAMPLE 18.2 Quadratic Splines

Problem Statement. Fit quadratic splines to the same data employed in Example 18.1 


(Table 18.1). Use the results to estimate the value at x = 5. 


Solution. For the present problem, we have four data points (n = 4) and n - 1 = 3 intervals.
Therefore, after applying the continuity condition and the zero second-derivative condition, this
means that 2(4 - 1) - 1 = 5 conditions are required. Equation (18.8) is written for i = 1
through 3 (with $c_1 = 0$) to give


$$\begin{aligned} f_1 + b_1 h_1 &= f_2 \\ f_2 + b_2 h_2 + c_2 h_2^2 &= f_3 \\ f_3 + b_3 h_3 + c_3 h_3^2 &= f_4 \end{aligned}$$


Continuity of derivatives, Eq. (18.9), creates an additional 3 - 1 = 2 conditions (again,
recall that $c_1 = 0$):


$$\begin{aligned} b_1 &= b_2 \\ b_2 + 2c_2 h_2 &= b_3 \end{aligned}$$


The necessary function and interval width values are 


f =25 h =4.5 -3.0 =1.5 
f=10 hy =7.0 — 4.5 =2.5 
h=25 h, = 9.0 — 7.0 = 2.0 


f,=0.5 
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These values can be substituted into the conditions which can be expressed in matrix 
form as 


15 0 0 0 O07 fb -1.5 
0 25 625 0 oļļlb 1.5 
0 0 0 2 4[eab=% -2 
1 -1 0 o0 0] 5, 0 
0 1 5 -1 Olle; 0 


These equations can be solved using MATLAB with the results: 


$$ b_1 = -1 $$
$$ b_2 = -1 \qquad c_2 = 0.64 $$
$$ b_3 = 2.2 \qquad c_3 = -1.6 $$
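The solution step above can be sketched in a few lines of MATLAB. This is only a minimal illustration: the variable names (A, r, u, s2) are our own, and the unknowns are assumed to be ordered as b1, b2, c2, b3, c3, matching the matrix form given above.

```matlab
% Coefficient matrix and right-hand side for the five conditions.
% Unknown ordering (an assumption of this sketch): [b1; b2; c2; b3; c3]
A = [1.5  0    0     0  0
     0    2.5  6.25  0  0
     0    0    0     2  4
     1   -1    0     0  0
     0    1    5    -1  0];
r = [-1.5; 1.5; -2; 0; 0];

u = A\r;                      % solve with left division

b2 = u(2); c2 = u(3);
% Quadratic for the second interval (a2 = f2 = 1.0, x2 = 4.5)
s2 = @(x) 1.0 + b2*(x - 4.5) + c2*(x - 4.5).^2;
s2at5 = s2(5)                 % estimate at x = 5
```

Because the system is small, general left division is perfectly adequate here; nothing about the quadratic spline equations requires a special solver.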


These results, along with the values for the a’s [Eq. (18.6)], can be substituted into the 
original quadratic equations to develop the following quadratic splines for each interval: 


$$ s_1(x) = 2.5 - (x - 3) $$
$$ s_2(x) = 1.0 - (x - 4.5) + 0.64(x - 4.5)^2 $$
$$ s_3(x) = 2.5 + 2.2(x - 7.0) - 1.6(x - 7.0)^2 $$

Because x = 5 lies in the second interval, we use $s_2$ to make the prediction,

$$ s_2(5) = 1.0 - (5 - 4.5) + 0.64(5 - 4.5)^2 = 0.66 $$


The total quadratic spline fit is depicted in Fig. 18.4b. Notice that there are two shortcom- 
ings that detract from the fit: (1) the straight line connecting the first two points and (2) the 
spline for the last interval seems to swing too high. The cubic splines in the next section 
do not exhibit these shortcomings and, as a consequence, are better methods for spline 
interpolation. 


18.4 CUBIC SPLINES


As stated at the beginning of the previous section, cubic splines are most frequently used 
in practice. The shortcomings of linear and quadratic splines have already been discussed. 
Quartic or higher-order splines are not used because they tend to exhibit the instabilities 
inherent in higher-order polynomials. Cubic splines are preferred because they provide 
the simplest representation that exhibits the desired appearance of smoothness. 

The objective in cubic splines is to derive a third-order polynomial for each interval 
between knots as represented generally by 


$$ s_i(x) = a_i + b_i(x - x_i) + c_i(x - x_i)^2 + d_i(x - x_i)^3 \tag{18.10} $$


Thus, for n data points (i = 1, 2,..., n), there are n — 1 intervals and 4(n — 1) un- 
known coefficients to evaluate. Consequently, 4(n — 1) conditions are required for their 
evaluation. 
The first conditions are identical to those used for the quadratic case. That is, they are
set up so that the functions pass through the points and that the first derivatives at the knots 
are equal. In addition to these, conditions are developed to ensure that the second deriva- 
tives at the knots are also equal. This greatly enhances the fit’s smoothness. 

After these conditions are developed, two additional conditions are required to obtain 
the solution. This is a much nicer outcome than occurred for quadratic splines where we 
needed to specify a single condition. In that case, we had to arbitrarily specify a zero sec- 
ond derivative for the first interval, hence making the result asymmetric. For cubic splines, 
we are in the advantageous position of needing two additional conditions and can, there- 
fore, apply them evenhandedly at both ends. 

For cubic splines, these last two conditions can be formulated in several different ways. 
A very common approach is to assume that the second derivatives at the first and last 
knots are equal to zero. The visual interpretation of these conditions is that the function 
becomes a straight line at the end nodes. Specification of such an end condition leads to 
what is termed a “natural” spline. It is given this name because the drafting spline naturally 
behaves in this fashion (Fig. 18.2). 

There are a variety of other end conditions that can be specified. Two of the more popu- 
lar are the clamped condition and the not-a-knot conditions. We will describe these options 
in Sec. 18.4.2. For the following derivation, we will limit ourselves to natural splines. 

Once the additional end conditions are specified, we would have the 4(n — 1) condi- 
tions needed to evaluate the 4(n — 1) unknown coefficients. Whereas it is certainly pos- 
sible to develop cubic splines in this fashion, we will present an alternative approach that 
requires the solution of only n — 1 equations. Further, the simultaneous equations will 
be tridiagonal and hence can be solved very efficiently. Although the derivation of this 
approach is less straightforward than for quadratic splines, the gain in efficiency is well 
worth the effort. 


18.4.1 Derivation of Cubic Splines 


As was the case with quadratic splines, the first condition is that the spline must pass 
through all the data points: 


$$ f_i = a_i + b_i(x_i - x_i) + c_i(x_i - x_i)^2 + d_i(x_i - x_i)^3 $$

which simplifies to

$$ a_i = f_i \tag{18.11} $$


Therefore, the constant in each cubic must be equal to the value of the dependent variable 
at the beginning of the interval. This result can be incorporated into Eq. (18.10): 


$$ s_i(x) = f_i + b_i(x - x_i) + c_i(x - x_i)^2 + d_i(x - x_i)^3 \tag{18.12} $$


Next, we will apply the condition that each of the cubics must join at the knots. For 
knot i + 1, this can be represented as 


$$ f_i + b_i h_i + c_i h_i^2 + d_i h_i^3 = f_{i+1} \tag{18.13} $$

where

$$ h_i = x_{i+1} - x_i $$


The first derivatives at the interior nodes must be equal. Equation (18.12) is differenti- 
ated to yield 


$$ s_i'(x) = b_i + 2c_i(x - x_i) + 3d_i(x - x_i)^2 \tag{18.14} $$

The equivalence of the derivatives at an interior node, i + 1, can therefore be written as

$$ b_i + 2c_i h_i + 3d_i h_i^2 = b_{i+1} \tag{18.15} $$


The second derivatives at the interior nodes must also be equal. Equation (18.14) can 
be differentiated to yield 


$$ s_i''(x) = 2c_i + 6d_i(x - x_i) \tag{18.16} $$

The equivalence of the second derivatives at an interior node, i + 1, can therefore be written as

$$ c_i + 3d_i h_i = c_{i+1} \tag{18.17} $$

Next, we can solve Eq. (18.17) for $d_i$:

$$ d_i = \frac{c_{i+1} - c_i}{3h_i} \tag{18.18} $$


This can be substituted into Eq. (18.13) to give 


$$ f_i + b_i h_i + \frac{h_i^2}{3}(2c_i + c_{i+1}) = f_{i+1} \tag{18.19} $$


Equation (18.18) can also be substituted into Eq. (18.15) to give 
$$ b_{i+1} = b_i + h_i(c_i + c_{i+1}) \tag{18.20} $$


Equation (18.19) can be solved for $b_i$:

$$ b_i = \frac{f_{i+1} - f_i}{h_i} - \frac{h_i}{3}(2c_i + c_{i+1}) \tag{18.21} $$


The index of this equation can be reduced by 1: 


$$ b_{i-1} = \frac{f_i - f_{i-1}}{h_{i-1}} - \frac{h_{i-1}}{3}(2c_{i-1} + c_i) \tag{18.22} $$


The index of Eq. (18.20) can also be reduced by 1: 
$$ b_i = b_{i-1} + h_{i-1}(c_{i-1} + c_i) \tag{18.23} $$


Equations (18.21) and (18.22) can be substituted into Eq. (18.23) and the result simplified 
to yield 


$$ h_{i-1}c_{i-1} + 2(h_{i-1} + h_i)c_i + h_i c_{i+1} = 3\,\frac{f_{i+1} - f_i}{h_i} - 3\,\frac{f_i - f_{i-1}}{h_{i-1}} \tag{18.24} $$
This equation can be made a little more concise by recognizing that the terms on the right-hand side are finite differences [recall Eq. (17.15)]:

$$ f[x_i, x_j] = \frac{f_i - f_j}{x_i - x_j} $$


Therefore, Eq. (18.24) can be written as 


$$ h_{i-1}c_{i-1} + 2(h_{i-1} + h_i)c_i + h_i c_{i+1} = 3\left(f[x_{i+1}, x_i] - f[x_i, x_{i-1}]\right) \tag{18.25} $$


Equation (18.25) can be written for the interior knots, i = 2, 3, ..., n − 1, which results in n − 2 simultaneous tridiagonal equations with n unknown coefficients, $c_1, c_2, \ldots, c_n$. Therefore, if we have two additional conditions, we can solve for the c's. Once this is done, Eqs. (18.21) and (18.18) can be used to determine the remaining coefficients, b and d.

As stated previously, the two additional end conditions can be formulated in a number 
of ways. One common approach, the natural spline, assumes that the second derivatives 
at the end knots are equal to zero. To see how these can be integrated into the solution 
scheme, the second derivative at the first node [Eq. (18.16)] can be set to zero as in 


$$ s_1''(x_1) = 0 = 2c_1 + 6d_1(x_1 - x_1) $$

Thus, this condition amounts to setting $c_1$ equal to zero.

The same evaluation can be made at the last node:

$$ s_{n-1}''(x_n) = 0 = 2c_{n-1} + 6d_{n-1}h_{n-1} \tag{18.26} $$

Recalling Eq. (18.17), we can conveniently define an extraneous parameter $c_n$, in which case Eq. (18.26) becomes

$$ c_{n-1} + 3d_{n-1}h_{n-1} = c_n = 0 $$

Thus, to impose a zero second derivative at the last node, we set $c_n = 0$.
The final equations can now be written in matrix form as 


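$$
\begin{bmatrix}
1 & & & & \\
h_1 & 2(h_1 + h_2) & h_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & h_{n-2} & 2(h_{n-2} + h_{n-1}) & h_{n-1} \\
 & & & & 1
\end{bmatrix}
\begin{Bmatrix} c_1 \\ c_2 \\ \vdots \\ c_{n-1} \\ c_n \end{Bmatrix}
=
\begin{Bmatrix}
0 \\
3\left(f[x_3, x_2] - f[x_2, x_1]\right) \\
\vdots \\
3\left(f[x_n, x_{n-1}] - f[x_{n-1}, x_{n-2}]\right) \\
0
\end{Bmatrix}
\tag{18.27}
$$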


As shown, the system is tridiagonal and hence efficient to solve. 
EXAMPLE 18.3 Natural Cubic Splines


Problem Statement. Fit cubic splines to the same data used in Examples 18.1 and 18.2 
(Table 18.1). Utilize the results to estimate the value at x = 5. 


Solution. The first step is to employ Eq. (18.27) to generate the set of simultaneous equa- 
tions that will be utilized to determine the c coefficients: 
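$$
\begin{bmatrix}
1 & & & \\
h_1 & 2(h_1 + h_2) & h_2 & \\
 & h_2 & 2(h_2 + h_3) & h_3 \\
 & & & 1
\end{bmatrix}
\begin{Bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{Bmatrix}
=
\begin{Bmatrix}
0 \\
3\left(f[x_3, x_2] - f[x_2, x_1]\right) \\
3\left(f[x_4, x_3] - f[x_3, x_2]\right) \\
0
\end{Bmatrix}
$$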
The necessary function and interval width values are 
$$ f_1 = 2.5 \qquad h_1 = 4.5 - 3.0 = 1.5 $$
$$ f_2 = 1.0 \qquad h_2 = 7.0 - 4.5 = 2.5 $$
$$ f_3 = 2.5 \qquad h_3 = 9.0 - 7.0 = 2.0 $$
$$ f_4 = 0.5 $$
These can be substituted to yield 
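$$
\begin{bmatrix}
1 & & & \\
1.5 & 8 & 2.5 & \\
 & 2.5 & 9 & 2 \\
 & & & 1
\end{bmatrix}
\begin{Bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{Bmatrix}
=
\begin{Bmatrix} 0 \\ 4.8 \\ -4.8 \\ 0 \end{Bmatrix}
$$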
These equations can be solved using MATLAB with the results: 
$$ c_1 = 0 \qquad c_2 = 0.839543726 $$
$$ c_3 = -0.766539924 \qquad c_4 = 0 $$

Equations (18.21) and (18.18) can be used to compute the b's and d's:

$$ b_1 = -1.419771863 \qquad d_1 = 0.186565272 $$
$$ b_2 = -0.160456274 \qquad d_2 = -0.214144487 $$
$$ b_3 = 0.022053232 \qquad d_3 = 0.127756654 $$


These results, along with the values for the a’s [Eq. (18.11)], can be substituted into 
Eq. (18.10) to develop the following cubic splines for each interval: 


$$ s_1(x) = 2.5 - 1.419771863(x - 3) + 0.186565272(x - 3)^3 $$

$$ s_2(x) = 1.0 - 0.160456274(x - 4.5) + 0.839543726(x - 4.5)^2 - 0.214144487(x - 4.5)^3 $$

$$ s_3(x) = 2.5 + 0.022053232(x - 7.0) - 0.766539924(x - 7.0)^2 + 0.127756654(x - 7.0)^3 $$


The three equations can then be employed to compute values within each interval. For 
example, the value at x = 5, which falls within the second interval, is calculated as 


$$ s_2(5) = 1.0 - 0.160456274(5 - 4.5) + 0.839543726(5 - 4.5)^2 - 0.214144487(5 - 4.5)^3 = 1.102889734 $$


The total cubic spline fit is depicted in Fig. 18.4c. 
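The steps of this example can be sketched compactly in MATLAB. The following is a minimal illustration rather than a general-purpose routine: the variable names (A, r, fd, s5) are our own choices, and the system is assembled as a full matrix and solved with left division instead of with a dedicated tridiagonal solver.

```matlab
x = [3 4.5 7 9];  f = [2.5 1 2.5 0.5];   % data from Table 18.1
n = length(x);
h  = diff(x);            % interval widths h_i
fd = diff(f)./h;         % finite differences f[x_{i+1}, x_i]

% Assemble Eq. (18.27) for a natural spline
A = zeros(n);  r = zeros(n,1);
A(1,1) = 1;  A(n,n) = 1;
for i = 2:n-1
  A(i,i-1:i+1) = [h(i-1), 2*(h(i-1)+h(i)), h(i)];
  r(i) = 3*(fd(i) - fd(i-1));
end
c = A\r;                                        % c coefficients

% Remaining coefficients from Eqs. (18.21) and (18.18)
b = fd(:) - h(:)/3.*(2*c(1:n-1) + c(2:n));
d = (c(2:n) - c(1:n-1))./(3*h(:));

% Evaluate the second cubic at x = 5
k = 2;  t = 5 - x(k);
s5 = f(k) + b(k)*t + c(k)*t^2 + d(k)*t^3
```

For a handful of points the full matrix is harmless. For large data sets, the tridiagonal structure would be worth exploiting, for example by storing the system as a sparse matrix.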
The results of Examples 18.1 through 18.3 are summarized in Fig. 18.4. Notice the
progressive improvement of the fit as we move from linear to quadratic to cubic splines. We 
have also superimposed a cubic interpolating polynomial on Fig. 18.4c. Although the cubic 
spline consists of a series of third-order curves, the resulting fit differs from that obtained 
using the third-order polynomial. This is due to the fact that the natural spline requires zero 
second derivatives at the end knots, whereas the cubic polynomial has no such constraint. 


18.4.2 End Conditions 


Although its graphical basis is appealing, the natural spline is only one of several end con- 
ditions that can be specified for splines. Two of the most popular are 


• Clamped End Condition. This option involves specifying the first derivatives at the first and last nodes. This is sometimes called a “clamped” spline because it is what occurs when you clamp the end of a drafting spline so that it has a desired slope. For example, if zero first derivatives are specified, the spline will level off or become horizontal at the ends.

• “Not-a-Knot” End Condition. A third alternative is to force continuity of the third derivative at the second and the next-to-last knots. Since the spline already specifies that the function value and its first and second derivatives are equal at these knots, specifying continuous third derivatives means that the same cubic functions will apply to each of the first and last two adjacent segments. Since the first internal knots no longer represent the junction of two different cubic functions, they are no longer true knots. Hence, this case is referred to as the “not-a-knot” condition. It has the additional property that for four points, it yields the same result as is obtained using an ordinary cubic interpolating polynomial of the sort described in Chap. 17.
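The four-point property of the not-a-knot condition is easy to check numerically. The sketch below assumes the data of Table 18.1 and uses the built-in spline function (which applies not-a-knot end conditions by default) together with polyfit and polyval; the names ys, yp, and maxdiff are our own.

```matlab
x = [3 4.5 7 9];  y = [2.5 1 2.5 0.5];   % data from Table 18.1
xx = linspace(3,9);

ys = spline(x,y,xx);        % not-a-knot cubic spline
p  = polyfit(x,y,3);        % cubic polynomial through the four points
yp = polyval(p,xx);

maxdiff = max(abs(ys - yp)) % should be at roundoff level
```

With more than four points the two fits generally differ, because a single cubic can no longer pass through all the data.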


These conditions can be readily applied by using Eq. (18.25) for the interior knots, i = 2, 3, ..., n − 1, and using the first (1) and last (n) equations as written in Table 18.2.

Figure 18.5 shows a comparison of the three end conditions as applied to fit the data from Table 18.1. The clamped case is set up so that the derivatives at the ends are equal to zero.

As expected, the spline fit for the clamped case levels off at the ends. In contrast, the 
natural and not-a-knot cases follow the trend of the data points more closely. Notice how 
the natural spline tends to straighten out as would be expected because the second deriva- 
tives go to zero at the ends. Because it has nonzero second derivatives at the ends, the 
not-a-knot exhibits more curvature. 


TABLE 18.2 The first and last equations needed to specify some commonly used end conditions for cubic splines.

Condition      First and Last Equations

Natural        $c_1 = 0$, $c_n = 0$

Clamped        $2h_1 c_1 + h_1 c_2 = 3f[x_2, x_1] - 3f'_1$
               $h_{n-1}c_{n-1} + 2h_{n-1}c_n = 3f'_n - 3f[x_n, x_{n-1}]$
               (where $f'_1$ and $f'_n$ are the specified first derivatives at the first and last nodes, respectively)

Not-a-knot     $h_2 c_1 - (h_1 + h_2)c_2 + h_1 c_3 = 0$
               $h_{n-1}c_{n-2} - (h_{n-2} + h_{n-1})c_{n-1} + h_{n-2}c_n = 0$
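As an illustration of how the clamped rows of Table 18.2 slot into the system, the following sketch builds the equations for the data of Table 18.1 with zero end slopes and compares the estimate at x = 5 with the one from the built-in spline function. The names fp1, fpn, sOwn, and sBuiltin are hypothetical, and the full-matrix solve is chosen only for brevity.

```matlab
x = [3 4.5 7 9];  y = [2.5 1 2.5 0.5];   % data from Table 18.1
fp1 = 0;  fpn = 0;                        % specified end slopes
n = length(x);  h = diff(x);  fd = diff(y)./h;

A = zeros(n);  r = zeros(n,1);
A(1,1:2) = [2*h(1), h(1)];                % clamped first equation
r(1) = 3*fd(1) - 3*fp1;
for i = 2:n-1                             % Eq. (18.25) at interior knots
  A(i,i-1:i+1) = [h(i-1), 2*(h(i-1)+h(i)), h(i)];
  r(i) = 3*(fd(i) - fd(i-1));
end
A(n,n-1:n) = [h(n-1), 2*h(n-1)];          % clamped last equation
r(n) = 3*fpn - 3*fd(n-1);
c = A\r;

b = fd(:) - h(:)/3.*(2*c(1:n-1) + c(2:n));   % Eq. (18.21)
d = (c(2:n) - c(1:n-1))./(3*h(:));           % Eq. (18.18)

k = 2;  t = 5 - x(k);                     % x = 5 lies in the second interval
sOwn = y(k) + b(k)*t + c(k)*t^2 + d(k)*t^3;
sBuiltin = spline(x,[fp1 y fpn],5);       % built-in clamped spline
```

The two estimates should agree to roundoff, and b(1) should reproduce the specified slope at the first node.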
FIGURE 18.5
Comparison of the clamped (with zero first derivatives), not-a-knot, and natural splines for the data from Table 18.1.

18.5 PIECEWISE INTERPOLATION IN MATLAB

MATLAB has several built-in functions to implement piecewise interpolation. The spline function performs cubic spline interpolation as described in this chapter. The pchip function implements piecewise cubic Hermite interpolation. The interp1 function can implement spline and Hermite interpolation as well as a number of other types of piecewise interpolation.
18.5.1 MATLAB Function: spline 
Cubic splines can be easily computed with the built-in MATLAB function, spline. It has 
the general syntax 

yy = spline(x, y, xx) (18.28) 
where x and y = vectors containing the values that are to be interpolated, and yy = a vector 
containing the results of the spline interpolation as evaluated at the points in the vector xx. 

By default, spline uses the not-a-knot condition. However, if y contains two more 
values than x has entries, then the first and last value in y are used as the derivatives at the 
end points. Consequently, this option provides the means to implement the clamped-end 
condition. 

EXAMPLE 18.4 Splines in MATLAB 


Problem Statement. Runge's function is a notorious example of a function that cannot be fit well with polynomials (recall Example 17.7):

$$ f(x) = \frac{1}{1 + 25x^2} $$

Use MATLAB to fit nine equally spaced data points sampled from this function in the interval [−1, 1]. Employ (a) a not-a-knot spline and (b) a clamped spline with end slopes of $f'_1 = 1$ and $f'_n = -4$ at the first and last points, respectively.


Solution. (a) The nine equally spaced data points can be generated as in 


>> x = linspace(-1,1,9);
>> y = 1./(1+25*x.^2);


Next, a more finely spaced vector of values can be generated so that we can create a smooth 
plot of the results as generated with the spline function: 


>> xx = linspace(-1,1);
>> yy = spline(x,y,xx);


Recall that linspace automatically creates 100 points if the desired number of points is not specified. Finally, we can generate values for Runge's function itself and display them along with the spline fit and the original data:


>> yr = 1./(1+25*xx.^2);
>> plot(x,y,'o',xx,yy,xx,yr,'--')


As in Fig. 18.6, the not-a-knot spline does a nice job of following Runge’s function without 
exhibiting wild oscillations between the points. 


FIGURE 18.6
Comparison of Runge's function (dashed line) with a 9-point not-a-knot spline fit generated with MATLAB (solid line).

(b) The clamped condition can be implemented by creating a new vector yc that has the desired first derivatives as its first and last elements. The new vector can then be used to generate and plot the spline fit:

>> yc = [1 y -4];
>> yyc = spline(x,yc,xx);
>> plot(x,y,'o',xx,yyc,xx,yr,'--')

FIGURE 18.7
Comparison of Runge's function (dashed line) with a 9-point clamped end spline fit generated with MATLAB (solid line). Note that first derivatives of 1 and −4 are specified at the left and right boundaries, respectively.


As in Fig. 18.7, the clamped spline now exhibits some oscillations because of the artificial 
slopes that we have imposed at the boundaries. In other examples, where we have knowl- 
edge of the true first derivatives, the clamped spline tends to improve the fit. 
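To experiment with that last remark, the true end slopes of Runge's function can be supplied instead of the artificial ones. The sketch below is only one way to set this up: the analytical derivative, $f'(x) = -50x/(1 + 25x^2)^2$, follows from differentiating Runge's function, and the names dfdx, errNotAKnot, and errClamped are our own.

```matlab
x = linspace(-1,1,9);
y = 1./(1+25*x.^2);
dfdx = @(x) -50*x./(1+25*x.^2).^2;     % analytical derivative of Runge's function

xx = linspace(-1,1);
yr = 1./(1+25*xx.^2);                  % exact values for comparison

yy  = spline(x,y,xx);                        % not-a-knot
yyc = spline(x,[dfdx(-1) y dfdx(1)],xx);     % clamped with true end slopes

errNotAKnot = max(abs(yy - yr));       % maximum absolute errors on the fine grid
errClamped  = max(abs(yyc - yr));
plot(x,y,'o',xx,yyc,xx,yr,'--')
```

Comparing errNotAKnot and errClamped gives a quick quantitative check on how much the end conditions matter for this particular function.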


18.5.2 MATLAB Function: interp1 


The built-in function interp1 provides a handy means to implement a number of different 
types of piecewise one-dimensional interpolation. It has the general syntax 


yi = interp1(x, y, xi, 'method')


where x and y = vectors containing values that are to be interpolated, yi = a vector contain- 
ing the results of the interpolation as evaluated at the points in the vector xi, and 'method' = 
the desired method. The various methods are 


• 'nearest': nearest neighbor interpolation. This method sets the value of an interpolated point to the value of the nearest existing data point. Thus, the interpolation looks like a series of plateaus, which can be thought of as zero-order polynomials.

• 'linear': linear interpolation. This method uses straight lines to connect the points.

• 'spline': piecewise cubic spline interpolation. This is identical to the spline function.

• 'pchip' and 'cubic': piecewise cubic Hermite interpolation.


If the 'method' argument is omitted, the default is linear interpolation. 

The pchip option (short for “piecewise cubic Hermite interpolation”) merits more dis- 
cussion. As with cubic splines, pchip uses cubic polynomials to connect data points with 
continuous first derivatives. However, it differs from cubic splines in that the second de- 
rivatives are not necessarily continuous. Further, the first derivatives at the knots will not 
be the same as for cubic splines. Rather, they are expressly chosen so that the interpolation 
is “shape preserving.” That is, the interpolated values do not tend to overshoot the data 
points as can sometimes happen with cubic splines. 

Therefore, there are trade-offs between the spline and the pchip options. The results of 
using spline will generally appear smoother because the human eye can detect discontinui- 
ties in the second derivative. In addition, it will be more accurate if the data are values of a 
smooth function. On the other hand, pchip has no overshoots and less oscillation if the data 
are not smooth. These trade-offs, as well as those involving the other options, are explored 
in the following example. 


EXAMPLE 18.5 Trade-Offs Using interp1

Problem Statement. You perform a test drive on an automobile where you alternately accelerate the automobile and then hold it at a steady velocity. Note that you never decelerate during the experiment. The time series of spot measurements of velocity can be tabulated as

t    0    20    40    56    68    80    84    96    104    110
v    0    20    20    38    80    80    100   100   125    125

Use MATLAB's interp1 function to fit these data with (a) linear interpolation, (b) nearest neighbor, (c) cubic spline with not-a-knot end conditions, and (d) piecewise cubic Hermite interpolation.


Solution. (a) The data can be entered, fit with linear interpolation, and plotted with the 
following commands: 


>> t = [0 20 40 56 68 80 84 96 104 110];
>> v = [0 20 20 38 80 80 100 100 125 125];
>> tt = linspace(0,110);
>> vl = interp1(t,v,tt);
>> plot(t,v,'o',tt,vl)


The results (Fig. 18.8a) are not smooth, but do not exhibit any overshoot. 


(b) The commands to implement and plot the nearest neighbor interpolation are 


>> vn = interp1(t,v,tt,'nearest');
>> plot(t,v,'o',tt,vn)

As in Fig. 18.8b, the results look like a series of plateaus. This option is neither a smooth nor an accurate depiction of the underlying process.

FIGURE 18.8
Use of several options of the interp1 function to perform piecewise polynomial interpolation on a velocity time series for an automobile.


(c) The commands to implement the cubic spline are 


>> vs = interp1(t,v,tt,'spline');
>> plot(t,v,'o',tt,vs) 


These results (Fig. 18.8c) are quite smooth. However, severe overshoot occurs at several 
locations. This makes it appear that the automobile decelerated several times during the 
experiment. 
(d) The commands to implement the piecewise cubic Hermite interpolation are


>> vh = interp1(t,v,tt,'pchip'); 
>> plot(t,v,'o', tt,vh) 


For this case, the results (Fig. 18.8d) are physically realistic. Because of its shape-preserving 
nature, the velocities increase monotonically and never exhibit deceleration. Although the 
result is not as smooth as for the cubic splines, continuity of the first derivatives at the knots 
makes the transitions between points more gradual and hence more realistic. 
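The visual impressions from parts (c) and (d) can also be checked numerically. The sketch below tests whether each interpolant ever decreases between successive points of the fine grid; the flag names and the small tolerance used to ignore roundoff are our own assumptions.

```matlab
t = [0 20 40 56 68 80 84 96 104 110];
v = [0 20 20 38 80 80 100 100 125 125];
tt = linspace(0,110);

vs = interp1(t,v,tt,'spline');
vh = interp1(t,v,tt,'pchip');

% A negative difference between successive values signals apparent deceleration
splineDecelerates = any(diff(vs) < 0)
pchipDecelerates  = any(diff(vh) < -1e-9)   % tolerance guards against roundoff
```

A check of this kind is a handy safeguard whenever the physics of a problem dictates that the interpolated quantity should never decrease.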


18.6 MULTIDIMENSIONAL INTERPOLATION


The interpolation methods for one-dimensional problems can be extended to multidimen- 
sional interpolation. In this section, we will describe the simplest case of two-dimensional 
interpolation in Cartesian coordinates. In addition, we will describe MATLAB’s capabili- 
ties for multidimensional interpolation. 


18.6.1 Bilinear Interpolation 


Two-dimensional interpolation deals with determining intermediate values for functions of two variables $z = f(x_i, y_i)$. As depicted in Fig. 18.9, we have values at four points: $f(x_1, y_1)$, $f(x_2, y_1)$, $f(x_1, y_2)$, and $f(x_2, y_2)$. We want to interpolate between these points to estimate the value at an intermediate point $f(x_i, y_i)$. If we use a linear function, the result is a plane connecting the points as in Fig. 18.9. Such functions are called bilinear.

FIGURE 18.9
Graphical depiction of two-dimensional bilinear interpolation where an intermediate value (filled circle) is estimated based on four given values (open circles).

FIGURE 18.10
Two-dimensional bilinear interpolation can be implemented by first applying one-dimensional linear interpolation along the x dimension to determine values at $x_i$. These values can then be used to linearly interpolate along the y dimension to yield the final result at $x_i$, $y_i$.

A simple approach for developing the bilinear function is depicted in Fig. 18.10. First, we 
can hold the y value fixed and apply one-dimensional linear interpolation in the x direction. 
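Using the Lagrange form, the result at $(x_i, y_1)$ is

$$ f(x_i, y_1) = \frac{x_i - x_2}{x_1 - x_2} f(x_1, y_1) + \frac{x_i - x_1}{x_2 - x_1} f(x_2, y_1) \tag{18.29} $$

and at $(x_i, y_2)$ is

$$ f(x_i, y_2) = \frac{x_i - x_2}{x_1 - x_2} f(x_1, y_2) + \frac{x_i - x_1}{x_2 - x_1} f(x_2, y_2) \tag{18.30} $$

These points can then be used to linearly interpolate along the y dimension to yield the final result:

$$ f(x_i, y_i) = \frac{y_i - y_2}{y_1 - y_2} f(x_i, y_1) + \frac{y_i - y_1}{y_2 - y_1} f(x_i, y_2) \tag{18.31} $$

A single equation can be developed by substituting Eqs. (18.29) and (18.30) into Eq. (18.31) to give

$$
\begin{aligned}
f(x_i, y_i) ={}& \frac{x_i - x_2}{x_1 - x_2}\,\frac{y_i - y_2}{y_1 - y_2}\, f(x_1, y_1) + \frac{x_i - x_1}{x_2 - x_1}\,\frac{y_i - y_2}{y_1 - y_2}\, f(x_2, y_1) \\
&+ \frac{x_i - x_2}{x_1 - x_2}\,\frac{y_i - y_1}{y_2 - y_1}\, f(x_1, y_2) + \frac{x_i - x_1}{x_2 - x_1}\,\frac{y_i - y_1}{y_2 - y_1}\, f(x_2, y_2)
\end{aligned}
\tag{18.32}
$$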
EXAMPLE 18.6 Bilinear Interpolation


Problem Statement. Suppose you have measured temperatures at a number of coordi- 
nates on the surface of a rectangular heated plate: 


T(2, 1) = 60        T(9, 1) = 57.5
T(2, 6) = 55        T(9, 6) = 70

Use bilinear interpolation to estimate the temperature at $x_i = 5.25$ and $y_i = 4.8$.


Solution. Substituting these values into Eq. (18.32) gives 


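$$
\begin{aligned}
f(5.25, 4.8) ={}& \frac{5.25 - 9}{2 - 9}\,\frac{4.8 - 6}{1 - 6}\, 60 + \frac{5.25 - 2}{9 - 2}\,\frac{4.8 - 6}{1 - 6}\, 57.5 \\
&+ \frac{5.25 - 9}{2 - 9}\,\frac{4.8 - 1}{6 - 1}\, 55 + \frac{5.25 - 2}{9 - 2}\,\frac{4.8 - 1}{6 - 1}\, 70 = 61.2143
\end{aligned}
$$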
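Equation (18.32) is simple enough to code directly. The following minimal sketch evaluates it for the heated-plate data of this example; the weight names (Lx1, Lx2, Ly1, Ly2) are our own shorthand for the one-dimensional Lagrange factors.

```matlab
x1 = 2;  x2 = 9;  y1 = 1;  y2 = 6;          % corner coordinates
T11 = 60;  T21 = 57.5;  T12 = 55;  T22 = 70; % T(x1,y1), T(x2,y1), T(x1,y2), T(x2,y2)
xi = 5.25;  yi = 4.8;                        % point to be estimated

% One-dimensional Lagrange weights in each direction
Lx1 = (xi - x2)/(x1 - x2);  Lx2 = (xi - x1)/(x2 - x1);
Ly1 = (yi - y2)/(y1 - y2);  Ly2 = (yi - y1)/(y2 - y1);

% Eq. (18.32): products of the x and y weights multiply the corner values
Ti = Lx1*Ly1*T11 + Lx2*Ly1*T21 + Lx1*Ly2*T12 + Lx2*Ly2*T22
```

Note that the four weights sum to 1, so the estimate is a weighted average of the corner values.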


18.6.2 Multidimensional Interpolation in MATLAB 


MATLAB has two built-in functions for two- and three-dimensional piecewise interpola- 
tion: interp2 and interp3. As you might expect from their names, these functions operate 
in a similar fashion to interp1 (Sec. 18.5.2). For example, a simple representation of the 
syntax of interp2 is 


zi = interp2(x, y, z, xi, yi, 'method') 


where x and y = matrices containing the coordinates of the points at which the values 
in the matrix z are given, zi = a matrix containing the results of the interpolation as 
evaluated at the points in the matrices xi and yi, and method = the desired method. Note 
that the methods are identical to those used by interp1; that is, linear, nearest, spline, 
and cubic. 

As with interp1, if the method argument is omitted, the default is linear interpolation. 
For example, interp2 can be used to make the same evaluation as in Example 18.6 as 


>> x=[2 9]; 

>> y=[1 6]; 

>> z=[60 57.5;55 70]; 

>> interp2(x,y,z,5.25,4.8) 


ans = 
61.2143 
18.7 CASE STUDY  HEAT TRANSFER


Background. Lakes in the temperate zone can become thermally stratified during the 
summer. As depicted in Fig. 18.11, warm, buoyant water near the surface overlies colder, 
denser bottom water. Such stratification effectively divides the lake vertically into two 
layers: the epilimnion and the hypolimnion, separated by a plane called the thermocline. 

Thermal stratification has great significance for environmental engineers and scientists 
studying such systems. In particular, the thermocline greatly diminishes mixing between 
the two layers. As a result, decomposition of organic matter can lead to severe depletion of 
oxygen in the isolated bottom waters. 

The location of the thermocline can be defined as the inflection point of the temperature-depth curve; that is, the point at which $d^2T/dz^2 = 0$. It is also the point at which the absolute value of the first derivative or gradient is a maximum.

The temperature gradient is important in its own right because it can be used in conjunction with Fourier's law to determine the heat flux across the thermocline:

$$ J = -\alpha \rho C \frac{dT}{dz} \tag{18.33} $$

where J = heat flux [cal/(cm² · s)], α = an eddy diffusion coefficient (cm²/s), ρ = density (≅ 1 g/cm³), and C = specific heat [≅ 1 cal/(g · °C)].

In this case study, natural cubic splines are employed to determine the thermocline depth and temperature gradient for Platte Lake, Michigan (Table 18.3). The latter is also used to determine the heat flux for the case where α = 0.01 cm²/s.

FIGURE 18.11 
Temperature versus depth during summer for Platte Lake, Michigan. 
TABLE 18.3 Temperature versus depth during summer for Platte Lake, Michigan.


Solution. As just described, we want to use natural spline end conditions to perform this 
analysis. Unfortunately, because it uses not-a-knot end conditions, the built-in MATLAB 
spline function does not meet our needs. Further, the spline function does not return the 
first and second derivatives we require for our analysis. 

However, it is not difficult to develop our own M-file to implement a natural spline and return the derivatives. Such a code is shown in Fig. 18.12. After some preliminary error trapping, we set up and solve Eq. (18.27) for the second-order coefficients (c). Notice how we use two subfunctions, h and fd, to compute the required finite differences. Once Eq. (18.27) is set up, we solve for the c's with left division (the backslash operator). A loop is then employed to generate the other coefficients (a, b, and d).

FIGURE 18.12
M-file to determine intermediate values and derivatives with a natural spline. Note that the diff function employed for error trapping is described in Sec. 21.7.1.

function [yy,dy,d2] = natspline(x,y,xx)
% natspline: natural spline with differentiation
%   [yy,dy,d2] = natspline(x,y,xx): uses a natural cubic spline
%   interpolation to find yy, the values of the underlying function
%   y at the points in the vector xx. The vector x specifies the
%   points at which the data y is given.
% input:
%   x = vector of independent variables
%   y = vector of dependent variables
%   xx = vector of desired values of independent variables
% output:
%   yy = interpolated values at xx
%   dy = first derivatives at xx
%   d2 = second derivatives at xx

n = length(x);
if length(y)~=n, error('x and y must be same length'); end
if any(diff(x)<=0),error('x not strictly ascending'),end
m = length(xx);
aa = zeros(n,n);
aa(1,1) = 1; aa(n,n) = 1; %set up Eq. 18.27
bb(1)=0; bb(n)=0;
for i = 2:n-1
  aa(i,i-1) = h(x, i - 1);
  aa(i,i) = 2 * (h(x, i - 1) + h(x, i));
  aa(i,i+1) = h(x, i);
  bb(i) = 3 * (fd(i + 1, i, x, y) - fd(i, i - 1, x, y));
end
c=aa\bb'; %solve for c coefficients
for i = 1:n-1 %solve for a, b and d coefficients
  a(i) = y(i);
  b(i) = fd(i + 1, i, x, y) - h(x, i) / 3 * (2 * c(i) + c(i + 1));
  d(i) = (c(i + 1) - c(i)) / 3 / h(x, i);
end
for i = 1:m %perform interpolations at desired values
  [yy(i),dy(i),d2(i)] = SplineInterp(x, n, a, b, c, d, xx(i));
end
end
function hh = h(x, i)
hh = x(i + 1) - x(i);
end
function fdd = fd(i, j, x, y)
fdd = (y(i) - y(j)) / (x(i) - x(j));
end
function [yyy,dyy,d2y]=SplineInterp(x, n, a, b, c, d, xi)
for ii = 1:n - 1
  if xi >= x(ii) - 0.000001 && xi <= x(ii + 1) + 0.000001
    yyy=a(ii)+b(ii)*(xi-x(ii))+c(ii)*(xi-x(ii))^2+d(ii)...
        *(xi-x(ii))^3;
    dyy=b(ii)+2*c(ii)*(xi-x(ii))+3*d(ii)*(xi-x(ii))^2;
    d2y=2*c(ii)+6*d(ii)*(xi-x(ii));
    break
  end
end
end

At this point, we have all we need to generate intermediate values with the cubic 
equation: 


$$ f(x) = a_i + b_i(x - x_i) + c_i(x - x_i)^2 + d_i(x - x_i)^3 $$


We can also determine the first and second derivatives by differentiating this equation 
twice to give 


$$ f'(x) = b_i + 2c_i(x - x_i) + 3d_i(x - x_i)^2 $$
$$ f''(x) = 2c_i + 6d_i(x - x_i) $$


As in Fig. 18.12, these equations can then be implemented in another subfunction, 
SplineInterp, to determine the values and the derivatives at the desired intermediate 
values. 
Here is a script file that uses the natspline function to generate the spline and create 
plots of the results: 
z = [0 2.3 4.9 9.1 13.7 18.3 22.9 27.2];
T = [22.8 22.8 22.8 20.6 13.9 11.7 11.1 11.1];
zz = linspace(z(1),z(length(z)));
[TT,dT,dT2] = natspline(z,T,zz);
subplot(1,3,1),plot(T,z,'o',TT,zz)
title('(a) T'),legend('data','T')
set(gca,'YDir','reverse'),grid
subplot(1,3,2),plot(dT,zz)
title('(b) dT/dz')
set(gca,'YDir','reverse'),grid
subplot(1,3,3),plot(dT2,zz)
title('(c) d2T/dz2')
set(gca,'YDir','reverse'),grid


As in Fig. 18.13, the thermocline appears to be located at a depth of about 11.5 m. 
We can use root location (zero second derivative) or optimization methods (minimum first 
derivative) to refine this estimate. The result is that the thermocline is located at 11.35 m 
where the gradient is —1.61 °C/m. 
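One way to carry out the root-location refinement is sketched below. It is a self-contained illustration rather than the only possible approach: it rebuilds the natural-spline coefficients directly, exploits the fact that the second derivative of a cubic spline is piecewise linear with value $2c_i$ at each knot, and brackets the root with the two knots on either side of the visual estimate of 11.5 m. The names d2T, zt, grad, and J are our own.

```matlab
z = [0 2.3 4.9 9.1 13.7 18.3 22.9 27.2];         % depth, m
T = [22.8 22.8 22.8 20.6 13.9 11.7 11.1 11.1];   % temperature, deg C
n = length(z);  h = diff(z);  fd = diff(T)./h;

A = zeros(n);  r = zeros(n,1);                   % Eq. (18.27)
A(1,1) = 1;  A(n,n) = 1;
for i = 2:n-1
  A(i,i-1:i+1) = [h(i-1), 2*(h(i-1)+h(i)), h(i)];
  r(i) = 3*(fd(i) - fd(i-1));
end
c = A\r;
b = fd(:) - h(:)/3.*(2*c(1:n-1) + c(2:n));       % Eq. (18.21)
d = (c(2:n) - c(1:n-1))./(3*h(:));               % Eq. (18.18)

k = find(z <= 11.5, 1, 'last');          % interval containing the visual estimate
d2T = @(zq) interp1(z, 2*c.', zq);       % piecewise-linear second derivative
zt = fzero(d2T, [z(k) z(k+1)])           % thermocline depth, m

t = zt - z(k);
grad = b(k) + 2*c(k)*t + 3*d(k)*t^2      % gradient at the thermocline, deg C/m

alpha = 0.01;  rho = 1;  C = 1;          % cm^2/s, g/cm^3, cal/(g deg C)
J = -alpha*rho*C*grad/100*86400          % heat flux, cal/(cm^2 d)
```

Because the spline has more than one inflection point for these data, supplying fzero with a bracketing interval rather than a single initial guess helps ensure that the root found is the one near the steepest gradient.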


FIGURE 18.13 

Plots of (a) temperature, (b) gradient, and (c) second derivative versus depth (m) generated 
with the cubic spline program. The thermocline is located at the inflection point of the 
temperature-depth curve. 
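The gradient can be used to compute the heat flux across the thermocline with Eq. (18.33):

$$ J = -0.01\,\frac{\text{cm}^2}{\text{s}} \times 1\,\frac{\text{g}}{\text{cm}^3} \times 1\,\frac{\text{cal}}{\text{g} \cdot {}^\circ\text{C}} \times \left(-1.61\,\frac{{}^\circ\text{C}}{\text{m}}\right) \times \frac{1\ \text{m}}{100\ \text{cm}} \times \frac{86{,}400\ \text{s}}{\text{d}} = 13.9\,\frac{\text{cal}}{\text{cm}^2 \cdot \text{d}} $$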


The foregoing analysis demonstrates how spline interpolation can be used for engineering and scientific problem solving. However, it also is an example of numerical differentiation. As such, it illustrates how numerical approaches from different areas can be used in tandem for problem solving. We will be describing the topic of numerical differentiation in detail in Chap. 21.


PROBLEMS 


18.1 Given the data 
Fit these data with (a) a cubic spline with natural end conditions, (b) a cubic spline with not-a-knot end conditions, and (c) piecewise cubic Hermite interpolation.

18.2 A reactor is thermally stratified as in the following 
table: 


Depth, m 0 
Temperature, °C 70 70 55 22 13 10 10

Based on these temperatures, the tank can be idealized as two zones separated by a strong temperature gradient or thermocline. The depth of the thermocline can be defined as the inflection point of the temperature-depth curve; that is, the point at which $d^2T/dz^2 = 0$. At this depth, the heat flux from the surface to the bottom layer can be computed with Fourier's law:

$$ J = -k \frac{dT}{dz} $$

Use a clamped cubic spline fit with zero end derivatives to determine the thermocline depth. If k = 0.01 cal/(s · cm · °C), compute the flux across this interface.


18.3 The following is the built-in humps function that MATLAB 
uses to demonstrate some of its numerical capabilities: 


$$f(x) = \frac{1}{(x - 0.3)^2 + 0.01} + \frac{1}{(x - 0.9)^2 + 0.04} - 6$$


The humps function exhibits both flat and steep regions over a 
relatively short x range. Here are some values that have been 
generated at intervals of 0.1 over the range from x = 0 to 1: 


x      0       0.1      0.2      0.3      0.4      0.5
f(x)   5.176   15.471   45.887   96.500   47.448   19.000

x      0.6     0.7      0.8      0.9      1
f(x)   11.692  12.382   17.846   21.703   16.000


Fit these data with a (a) cubic spline with not-a-knot end 
conditions and (b) piecewise cubic Hermite interpolation. In 
both cases, create a plot comparing the fit with the exact 
humps function. 

18.4 Develop a plot of a cubic spline fit of the following 
data with (a) natural end conditions and (b) not-a-knot end 
conditions. In addition, develop a plot using (c) piecewise 
cubic Hermite interpolation (pchip). 


x      0    100      200      400      600      800      1000
f(x)   0    0.82436  1.00000  0.73576  0.40601  0.19915  0.09158

In each case, compare your plot with the following equation
which was used to generate the data:

$$f(x) = \frac{x}{200}\, e^{-x/200 + 1}$$


18.5 The following data are sampled from the step function 
depicted in Fig. 18.1: 


x      -1    -0.6   -0.2   0.2   0.6   1
f(x)    0     0      0     1     1     1


Fit these data with a (a) cubic spline with not-a-knot end con- 
ditions, (b) cubic spline with zero-slope clamped end condi- 
tions, and (c) piecewise cubic Hermite interpolation. In each 
case, create a plot comparing the fit with the step function. 
18.6 Develop an M-file to compute a cubic spline fit with 
natural end conditions. Test your code by using it to dupli- 
cate Example 18.3. 

18.7 The following data were generated with the fifth-order
polynomial $f(x) = 0.0185x^5 - 0.444x^4 + 3.9125x^3 - 15.456x^2 + 27.069x - 14.1$:

x      1      3      5      6      7      9
f(x)   1.000  2.172  4.220  5.430  4.912  9.120


(a) Fit these data with a cubic spline with not-a-knot end 
conditions. Create a plot comparing the fit with the function. 
(b) Repeat (a) but use clamped end conditions where the end 
slopes are set at the exact values as determined by differen- 
tiating the function. 

18.8 Bessel functions often arise in advanced engineering 
and scientific analyses such as the study of electric fields. 
These functions are usually not amenable to straightforward 
evaluation and, therefore, are often compiled in standard 
mathematical tables. For example, 


x        1.8     2       2.2     2.4     2.6
J_1(x)   0.5815  0.5767  0.556   0.5202  0.4708

Estimate $J_1(2.1)$, (a) using an interpolating polynomial and
(b) using cubic splines. Note that the true value is 0.5683. 
18.9 The following data define the sea-level concentra- 
tion of dissolved oxygen for fresh water as a function of 
temperature: 


T,°C 0 8 16 24 32 40 


o,mg/L 14.621 11.843 9.870 8.418 7.305 6.413 


Use MATLAB to fit the data with (a) piecewise linear interpolation,
(b) a fifth-order polynomial, and (c) a spline. Display
the results graphically and use each approach to estimate
o(27). Note that the exact result is 7.986 mg/L. 

18.10 (a) Use MATLAB to fit a cubic spline to the follow- 
ing data to determine y at x = 1.5: 


(b) Repeat (a), but with zero first derivatives at the end knots. 
18.11 Runge’s function is written as 


$$f(x) = \frac{1}{1 + 25x^2}$$


Generate five equidistantly spaced values of this function 
over the interval: [—1, 1]. Fit these data with (a) a fourth- 
order polynomial, (b) a linear spline, and (c) a cubic spline. 
Present your results graphically. 

18.12 Use MATLAB to generate eight points from the 
function 


$$f(t) = \sin^2 t$$

from t = 0 to 2π. Fit these data using (a) cubic spline with
not-a-knot end conditions, (b) cubic spline with derivative
end conditions equal to the exact values calculated with differentiation,
and (c) piecewise cubic Hermite interpolation.
Develop plots of each fit as well as plots of the absolute error
($E_t$ = approximation − true) for each. 

18.13 The drag coefficient for spheres such as sporting balls 
is known to vary as a function of the Reynolds number Re, 
a dimensionless number that gives a measure of the ratio of 
inertial forces to viscous forces: 


$$\mathrm{Re} = \frac{\rho V D}{\mu}$$

where ρ = the fluid’s density (kg/m³), V = its velocity (m/s),
D = diameter (m), and μ = dynamic viscosity (N · s/m²).
Although the relationship of drag to the Reynolds number is
sometimes available in equation form, it is frequently tabulated.
For example, the following table provides values for a
smooth spherical ball:

Re (×10⁻⁴)   2      5.8    16.8   27.2   29.9   33.9
C_D          0.52   0.52   0.52   0.5    0.49   0.44

Re (×10⁻⁴)   36.3   40     46     60     100    200    400
C_D          0.18   0.074  0.067  0.08   0.12   0.16   0.19

(a) Develop a MATLAB function that employs an appropriate
interpolation function to return a value of $C_D$ as a
function of the Reynolds number. The first line of the function
should be 


function CDout = Drag(ReCD,ReIn) 


where ReCD = a 2-row matrix containing the table, ReIn = 
the Reynolds number at which you want to estimate the 
drag, and CDout = the corresponding drag coefficient. 

(b) Write a script that uses the function developed in
part (a) to generate a labeled plot of the drag force versus
velocity (recall Sec. 1.4). Use the following parameter
values for the script: D = 22 cm, ρ = 1.3 kg/m³, and
μ = 1.78 × 10⁻⁵ Pa · s. Employ a range of velocities from 4
to 40 m/s for your plot. 

18.14 The following function describes the temperature distribution
on a rectangular plate for the range −2 ≤ x ≤ 0 and
0 ≤ y ≤ 3:

$$T = 2 + x - y + 2x^2 + 2xy + y^2$$

Develop a script to: (a) Generate a meshplot of this function
using the MATLAB function surfc. Employ the linspace
function with default spacing (i.e., 100 interior points) to
generate the x and y values. (b) Use the MATLAB function
interp2 with the default interpolation option ('linear') to
compute the temperature at x = −1.63 and y = 1.627. Determine
the percent relative error of your result. (c) Repeat (b),
but with 'spline'. Note: for parts (b) and (c), employ the
linspace function with 9 interior points. 

18.15 The U.S. Standard Atmosphere specifies atmospheric 
properties as a function of altitude above sea level. The fol- 
lowing table shows selected values of temperature, pressure, 
and density 


Altitude (km)   T (°C)   p (atm)          ρ (kg/m³)
-0.5            18.4     1.0607           1.2850
2.5             -1.1     0.73702          0.95697
6               -23.8    0.46589          0.66015
11              -56.2    0.22394          0.36481
20              -56.3    0.054557         0.088911
28              -48.5    0.015946         0.025076
50              -2.3     7.8721 × 10⁻⁴    1.0269 × 10⁻³
60              -17.2    2.2165 × 10⁻⁴    3.0588 × 10⁻⁴
80              -92.3    1.0227 × 10⁻⁵    1.9992 × 10⁻⁵
90              -92.3    1.6216 × 10⁻⁶    3.1703 × 10⁻⁶


Develop a MATLAB function, StdAtm, to determine values 
of the three properties for a given altitude. Base the func- 
tion on the pchip option for interp1. If the user requests 
a value outside the range of altitudes, have the function 
display an error message and terminate the application. 
Use the following script as the starting point to create a 
3-panel plot of altitude versus the properties as depicted in 
Fig. P18.15. 


% Script to generate a plot of temperature, pressure
% and density for the U.S. Standard Atmosphere
clc, clf
z=[-0.5 2.5 6 11 20 28 50 60 80 90];
T=[18.4 -1.1 -23.8 -56.2 -56.3 -48.5 -2.3 -17.2 -92.3 ...
   -92.3];
p=[1.0607 0.73702 0.46589 0.22394 0.054557 0.015946 ...
   7.8721e-4 2.2165e-4 1.02275e-05 1.6216e-06];
rho=[1.285025 0.95697 0.6601525 0.364805 0.0889105 ...
   0.02507575 0.001026918 0.000305883 0.000019992 ...
   3.1703e-06];
zint=[-0.5:0.1:90];
for i=1:length(zint)
  [Tint(i),pint(i),rint(i)]=StdAtm(z,T,p,rho,zint(i));
end

% Create plot

Te=StdAtm(z,T,p,rho,-1000);

18.16 Felix Baumgartner ascended to 39 km in a strato- 
spheric balloon and made a free-fall jump rushing toward 
earth at supersonic speeds before parachuting to the ground. 
As he fell, his drag coefficient changed primarily because 
the air density changed. Recall from Chap. 1 that the terminal
velocity, $v_{\text{terminal}}$ (m/s), of a free-falling object can be
computed as

$$v_{\text{terminal}} = \sqrt{\frac{g m}{c_d}}$$

where g = gravitational acceleration (m/s²), m = mass (kg),
and $c_d$ = drag coefficient (kg/m). The drag coefficient can
be computed as

$$c_d = 0.5 \rho A C_d$$

where ρ = the fluid density (kg/m³), A = the projected
area (m²), and $C_d$ = a dimensionless drag coefficient. Note
that the gravitational acceleration, g (m/s²), can be related to
elevation by

$$g = 9.806412 - 0.003039734z$$

where z = elevation above the earth’s surface (km) and the density
of air, ρ (kg/m³), at various elevations can be tabulated as

z (km)   ρ (kg/m³)    z (km)   ρ (kg/m³)    z (km)   ρ (kg/m³)
-1       1.347        6        0.6601       25       0.04008
0        1.225        7        0.5900       30       0.01841
1        1.112        8        0.5258       40       0.003996
2        1.007        9        0.4671       50       0.001027
3        0.9093       10       0.4135       60       0.0003097
4        0.8194       15       0.1948       70       8.283 × 10⁻⁵
5        0.7364       20       0.08891      80       1.846 × 10⁻⁵

Assume that m = 80 kg, A = 0.55 m², and $C_d$ = 1.1. Develop
a MATLAB script to create a labeled plot of terminal velocity
versus elevation for z=[0:0.5:40]. Use a spline to generate
the required densities needed to construct the plot.

FIGURE P18.15
(a) Temperature (°C), (b) pressure (atm), and (c) density (kg/m³).




Integration and 
Differentiation 


5.1 OVERVIEW 


In high school or during your first year of college, you were introduced to differential and 
integral calculus. There you learned techniques to obtain analytical or exact derivatives 
and integrals. 

Mathematically, the derivative represents the rate of change of a dependent variable 
with respect to an independent variable. For example, if we are given a function y(t) that 
specifies an object’s position as a function of time, differentiation provides a means to 
determine its velocity, as in: 


$$v(t) = \frac{d}{dt}\, y(t)$$

As in Fig. PT5.1a, the derivative can be visualized as the slope of a function. 

Integration is the inverse of differentiation. Just as dif- 
ferentiation uses differences to quantify an instantaneous 
process, integration involves summing instantaneous infor- 
mation to give a total result over an interval. Thus, if we are 
provided with velocity as a function of time, integration 
can be used to determine the distance traveled: 


$$y(t) = \int_0^t v(t)\, dt$$


As in Fig. PT5.1b, for functions lying above the abscissa, 
the integral can be visualized as the area under the curve 
of v(t) from 0 to t. Consequently, just as a derivative can 
be thought of as a slope, an integral can be envisaged as 
a summation. 

Because of the close relationship between differentiation and integration, we have opted
to devote this part of the book to both processes. Among other things, this will provide the
opportunity to highlight their similarities and differences from a numerical perspective.
In addition, the material will have relevance to the next part of the book where we will
cover differential equations. 

FIGURE PT5.1 
The contrast between (a) differentiation and (b) integration. 

Although differentiation is taught before integration in calculus, we reverse their order 
in the following chapters. We do this for several reasons. First, we have already introduced 
you to the basics of numerical differentiation in Chap. 4. Second, in part because it is much 
less sensitive to roundoff errors, integration represents a more highly developed area of 
numerical methods. Finally, although numerical differentiation is not as widely employed, 
it does have great significance for the solution of differential equations. Hence, it makes 
sense to cover it as the last topic prior to describing differential equations in Part Six. 


5.2 PART ORGANIZATION 


Chapter 19 is devoted to the most common approaches for numerical integration—the 
Newton-Cotes formulas. These relationships are based on replacing a complicated func- 
tion or tabulated data with a simple polynomial that is easy to integrate. Three of the 
most widely used Newton-Cotes formulas are discussed in detail: the trapezoidal rule, 
Simpson’s 1/3 rule, and Simpson’s 3/8 rule. All these formulas are designed for cases 
where the data to be integrated are evenly spaced. In addition, we also include a discussion 
of numerical integration of unequally spaced data. This is a very important topic because 
many real-world applications deal with data that are in this form. 
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All the above material relates to closed integration, where the function values at the 
ends of the limits of integration are known. At the end of Chap. 19, we present open inte- 
gration formulas, where the integration limits extend beyond the range of the known data. 
Although they are not commonly used for definite integration, open integration formulas 
are presented here because they are utilized in the solution of ordinary differential equa- 
tions in Part Six. 

The formulations covered in Chap. 19 can be employed to analyze both tabulated 
data and equations. Chapter 20 deals with two techniques that are expressly designed to 
integrate equations and functions: Romberg integration and Gauss quadrature. Computer 
algorithms are provided for both of these methods. In addition, adaptive integration is 
discussed. 

In Chap. 21, we present additional information on numerical differentiation to 
supplement the introductory material from Chap. 4. Topics include high-accuracy finite- 
difference formulas, Richardson extrapolation, and the differentiation of unequally spaced 
data. The effect of errors on both numerical differentiation and integration is also discussed. 


Numerical Integration 
Formulas 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to numerical integration. 
Specific objectives and topics covered are 


• Recognizing that Newton-Cotes integration formulas are based on the strategy of 
  replacing a complicated function or tabulated data with a polynomial that is easy 
  to integrate. 
• Knowing how to implement the following single application Newton-Cotes 
  formulas: 
    Trapezoidal rule 
    Simpson’s 1/3 rule 
    Simpson’s 3/8 rule 
• Knowing how to implement the following composite Newton-Cotes formulas: 
    Trapezoidal rule 
    Simpson’s 1/3 rule 
• Recognizing that even-segment–odd-point formulas like Simpson’s 1/3 rule 
  achieve higher than expected accuracy. 
• Knowing how to use the trapezoidal rule to integrate unequally spaced data. 
• Understanding the difference between open and closed integration formulas. 


YOU’VE GOT A PROBLEM 


Recall that the velocity of a free-falling bungee jumper as a function of time can be 
computed as 

$$v(t) = \sqrt{\frac{g m}{c_d}}\, \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right) \tag{19.1}$$

Suppose that we would like to know the vertical distance z the jumper has fallen after a 
certain time t. This distance can be evaluated by integration: 

$$z(t) = \int_0^t v(t)\, dt \tag{19.2}$$

Substituting Eq. (19.1) into Eq. (19.2) gives 

$$z(t) = \int_0^t \sqrt{\frac{g m}{c_d}}\, \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right) dt \tag{19.3}$$

Thus, integration provides the means to determine the distance from the velocity. Calculus 
can be used to solve Eq. (19.3) for 

$$z(t) = \frac{m}{c_d} \ln\left[\cosh\left(\sqrt{\frac{g c_d}{m}}\, t\right)\right] \tag{19.4}$$

Although a closed form solution can be developed for this case, there are other functions
that cannot be integrated analytically. Further, suppose that there was some way to 
measure the jumper’s velocity at various times during the fall. These velocities along with 
their associated times could be assembled as a table of discrete values. In this situation, it 
would also be possible to integrate the discrete data to determine the distance. In both these 
instances, numerical integration methods are available to obtain solutions. Chapters 19 and 
20 will introduce you to some of these methods. 


19.1 INTRODUCTION AND BACKGROUND 


19.1.1 What Is Integration? 


According to the dictionary definition, to integrate means “to bring together, as parts, into 
a whole; to unite; to indicate the total amount. . . .” Mathematically, definite integration is 
represented by 

$$I = \int_a^b f(x)\, dx \tag{19.5}$$

which stands for the integral of the function f(x) with respect to the independent variable x, 
evaluated between the limits x = a to x = b. 

As suggested by the dictionary definition, the “meaning” of Eq. (19.5) is the total 
value, or summation, of f(x) dx over the range x = a to b. In fact, the symbol ∫ is actually 
a stylized capital S that is intended to signify the close connection between integration and 
summation. 

Figure 19.1 represents a graphical manifestation of the concept. For functions lying 
above the x axis, the integral expressed by Eq. (19.5) corresponds to the area under the 
curve of f(x) between x = a and b. 

Numerical integration is sometimes referred to as quadrature. This is an archaic term 
that originally meant the construction of a square having the same area as some curvilinear 
figure. Today, the term quadrature is generally taken to be synonymous with numerical 
definite integration. 

FIGURE 19.1 
Graphical representation of the integral of f(x) between the limits x = a to b. The integral is 
equivalent to the area under the curve. 


19.1.2 Integration in Engineering and Science 


Integration has so many engineering and scientific applications that you were required to 
take integral calculus in your first year at college. Many specific examples of such applica- 
tions could be given in all fields of engineering and science. A number of examples relate 
directly to the idea of the integral as the area under a curve. Figure 19.2 depicts a few cases 
where integration is used for this purpose. 

Other common applications relate to the analogy between integration and summation. 
For example, a common application is to determine the mean of a continuous function. 
Recall that the mean of discrete data points can be calculated by [Eq. (14.2)]. 


$$\text{Mean} = \frac{\sum_{i=1}^{n} y_i}{n} \tag{19.6}$$

where $y_i$ are individual measurements. The determination of the mean of discrete points is 
depicted in Fig. 19.3a. 

In contrast, suppose that y is a continuous function of an independent variable x, as 
depicted in Fig. 19.3b. For this case, there are an infinite number of values between a and b. 
Just as Eq. (19.6) can be applied to determine the mean of the discrete readings, you might 
also be interested in computing the mean or average of the continuous function y = f(x) for 
the interval from a to b. Integration is used for this purpose, as specified by 

$$\text{Mean} = \frac{\int_a^b f(x)\, dx}{b - a} \tag{19.7}$$

This formula has hundreds of engineering and scientific applications. For example, it is 
used to calculate the center of gravity of irregular objects in mechanical and civil engineering
and to determine the root-mean-square current in electrical engineering. 

FIGURE 19.2 
Examples of how integration is used to evaluate areas in engineering and scientific applications. (a) A surveyor 
might need to know the area of a field bounded by a meandering stream and two roads. (b) A hydrologist might 
need to know the cross-sectional area of a river. (c) A structural engineer might need to determine the net force 
due to a nonuniform wind blowing against the side of a skyscraper. 

FIGURE 19.3 
An illustration of the mean for (a) discrete and (b) continuous data. 
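
Equation (19.7) can be checked numerically. The following lines are a minimal sketch rather than code from the text: the test function sin(x) on [0, π], the grid size, and the variable names are all illustrative assumptions, and the integral is approximated with MATLAB's trapz on a fine grid.

```matlab
% Mean of a continuous function via Eq. (19.7): integral divided by (b - a).
% f, a, b, and the grid size are illustrative choices, not from the text.
f = @(x) sin(x);
a = 0; b = pi;
x = linspace(a, b, 1001);            % fine grid for the numerical integral
meanContinuous = trapz(x, f(x)) / (b - a);
meanDiscrete = mean(f(x));           % Eq. (19.6) applied to the same samples
fprintf('continuous mean = %.6f (exact 2/pi = %.6f)\n', meanContinuous, 2/pi);
fprintf('discrete mean of samples = %.6f\n', meanDiscrete);
```

For a dense, evenly spaced sample the discrete mean of Eq. (19.6) approaches the continuous mean, which is the analogy between summation and integration that the text describes.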
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Integrals are also employed by engineers and scientists to evaluate the total amount or 
quantity of a given physical variable. The integral may be evaluated over a line, an area, 
or a volume. For example, the total mass of chemical contained in a reactor is given as the 
product of the concentration of chemical and the reactor volume, or 


Mass = concentration × volume 


where concentration has units of mass per volume. However, suppose that concentration 
varies from location to location within the reactor. In this case, it is necessary to sum the 
products of local concentrations $c_i$ and corresponding elemental volumes $\Delta V_i$: 

$$\text{Mass} = \sum_{i=1}^{n} c_i\, \Delta V_i$$

where n is the number of discrete volumes. For the continuous case, where c(x, y, z) is a 
known function and x, y, and z are independent variables designating position in Cartesian 
coordinates, integration can be used for the same purpose: 

$$\text{Mass} = \iiint c(x, y, z)\, dx\, dy\, dz$$

or 

$$\text{Mass} = \iiint_V c(V)\, dV$$


which is referred to as a volume integral. Notice the strong analogy between summation 
and integration. 
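
The analogy can be made concrete with a small numerical experiment. This is a hedged sketch under stated assumptions: the concentration field c(x, y, z) = 1 + x + yz and the unit-cube reactor are hypothetical, chosen only because the volume integral then has the simple analytical value 1 + 1/2 + 1/4 = 1.75.

```matlab
% Total mass as a sum of c_i * dV_i over small cells (midpoint sampling).
% The concentration field c and the unit-cube reactor are hypothetical.
c = @(x, y, z) 1 + x + y.*z;         % concentration, mass per volume
n = 20;                              % cells per coordinate direction
h = 1/n;                             % cell edge length
mid = ((1:n) - 0.5) * h;             % cell-center coordinates
[X, Y, Z] = ndgrid(mid, mid, mid);
dV = h^3;                            % elemental volume
massSum = sum(sum(sum(c(X, Y, Z)))) * dV;
massExact = 1 + 1/2 + 1/4;           % analytical volume integral over the cube
fprintf('sum = %.6f, exact = %.6f\n', massSum, massExact);
```

Because this particular field is linear in each coordinate, midpoint sampling reproduces the integral to roundoff; for a general field the sum only approaches the integral as the cells shrink.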

Similar examples could be given in other fields of engineering and science. For ex- 
ample, the total rate of energy transfer across a plane where the flux (in calories per square 
centimeter per second) is a function of position is given by 


$$\text{Flux} = \iint_A \text{flux}\, dA$$


which is referred to as an areal integral, where A = area. 

These are just a few of the applications of integration that you might face regularly 
in the pursuit of your profession. When the functions to be analyzed are simple, you will 
normally choose to evaluate them analytically. However, analytical evaluation is often
difficult or impossible when the function is complicated, as is typically the case in more
realistic examples. In addition, the underlying function is often unknown and defined only
by measurement at discrete points. For both these cases, you must have the ability to obtain
approximate values for integrals using numerical techniques as described next. 


19.2 NEWTON-COTES FORMULAS 


The Newton-Cotes formulas are the most common numerical integration schemes. They 
are based on the strategy of replacing a complicated function or tabulated data with a polynomial
that is easy to integrate: 

$$I = \int_a^b f(x)\, dx \cong \int_a^b f_n(x)\, dx \tag{19.8}$$

where $f_n(x)$ = a polynomial of the form 

$$f_n(x) = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1} + a_n x^n \tag{19.9}$$

where n is the order of the polynomial. For example, in Fig. 19.4a, a first-order polynomial 
(a straight line) is used as an approximation. In Fig. 19.4b, a parabola is employed for the 
same purpose. 

The integral can also be approximated using a series of polynomials applied piecewise 
to the function or data over segments of constant length. For example, in Fig. 19.5, three 
straight-line segments are used to approximate the integral. Higher-order polynomials can 
be utilized for the same purpose. 

Closed and open forms of the Newton-Cotes formulas are available. The closed forms 
are those where the data points at the beginning and end of the limits of integration are 
known (Fig. 19.6a). The open forms have integration limits that extend beyond the range 
of the data (Fig. 19.6b). This chapter emphasizes the closed forms. However, material on 
open Newton-Cotes formulas is briefly introduced in Sec. 19.7. 

FIGURE 19.4 
The approximation of an integral by the area under (a) a straight line and (b) a parabola. 

FIGURE 19.5 
The approximation of an integral by the area under three straight-line segments. 

FIGURE 19.6 
The difference between (a) closed and (b) open integration formulas. 


19.3 THE TRAPEZOIDAL RULE 


The trapezoidal rule is the first of the Newton-Cotes closed integration formulas. It cor- 
responds to the case where the polynomial in Eq. (19.8) is first-order: 


$$I = \int_a^b \left[ f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a) \right] dx \tag{19.10}$$

The result of the integration is 

$$I = (b - a)\, \frac{f(a) + f(b)}{2} \tag{19.11}$$

which is called the trapezoidal rule. 
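
For readers who want the intermediate step, the integration that leads from Eq. (19.10) to Eq. (19.11) can be written out as follows. This is a worked sketch added for illustration; it uses only the fact that the integral of (x − a) from a to b is (b − a)²/2.

```latex
\begin{aligned}
I &= \int_a^b \left[ f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a) \right] dx \\
  &= f(a)\,(b - a) + \frac{f(b) - f(a)}{b - a} \cdot \frac{(b - a)^2}{2} \\
  &= (b - a)\left[ f(a) + \frac{f(b) - f(a)}{2} \right]
   = (b - a)\,\frac{f(a) + f(b)}{2}
\end{aligned}
```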

Geometrically, the trapezoidal rule is equivalent to approximating the area of the trap- 
ezoid under the straight line connecting f(a) and f(b) in Fig. 19.7. Recall from geometry 
that the formula for computing the area of a trapezoid is the height times the average of the 
bases. In our case, the concept is the same but the trapezoid is on its side. Therefore, the 
integral estimate can be represented as 


I = width × average height    (19.12) 

or 

I = (b − a) × average height    (19.13) 

where, for the trapezoidal rule, the average height is the average of the function values at 
the end points, or [f(a) + f(b)]/2. 

All the Newton-Cotes closed formulas can be expressed in the general format of 
Eq. (19.13). That is, they differ only with respect to the formulation of the average height. 

FIGURE 19.7 
Graphical depiction of the trapezoidal rule. 


19.3.1 Error of the Trapezoidal Rule 


When we employ the integral under a straight-line segment to approximate the integral 
under a curve, we obviously can incur an error that may be substantial (Fig. 19.8). An estimate
for the local truncation error of a single application of the trapezoidal rule is 

$$E_t = -\frac{1}{12}\, f''(\xi)\,(b - a)^3 \tag{19.14}$$

where ξ lies somewhere in the interval from a to b. Equation (19.14) indicates that if the 
function being integrated is linear, the trapezoidal rule will be exact because the second 
derivative of a straight line is zero. Otherwise, for functions with second- and higher-order 
derivatives (i.e., with curvature), some error can occur. 


EXAMPLE 19.1 Single Application of the Trapezoidal Rule 

Problem Statement. Use Eq. (19.11) to numerically integrate 

$$f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5$$

from a = 0 to b = 0.8. Note that the exact value of the integral can be determined analytically 
to be 1.640533. 

FIGURE 19.8 
Graphical depiction of the use of a single application of the trapezoidal rule to approximate 
the integral of $f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5$ from x = 0 to 0.8. 

Solution. The function values f(0) = 0.2 and f(0.8) = 0.232 can be substituted into 
Eq. (19.11) to yield 


$$I = (0.8 - 0)\, \frac{0.2 + 0.232}{2} = 0.1728$$

which represents an error of $E_t = 1.640533 - 0.1728 = 1.467733$, which corresponds to 
a percent relative error of $\varepsilon_t = 89.5\%$. The reason for this large error is evident from the 
graphical depiction in Fig. 19.8. Notice that the area under the straight line neglects a sig- 
nificant portion of the integral lying above the line. 
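
The arithmetic above can be reproduced in a few lines of MATLAB. This is a minimal sketch of Eq. (19.11) applied to the example polynomial; the variable names (f, I, Itrue, Et, et) are illustrative and not part of the text.

```matlab
% Single application of the trapezoidal rule, Eq. (19.11), for Example 19.1.
f = @(x) 0.2 + 25*x - 200*x.^2 + 675*x.^3 - 900*x.^4 + 400*x.^5;
a = 0; b = 0.8;
I = (b - a) * (f(a) + f(b)) / 2;     % width times average height
Itrue = 1.640533;                    % analytical value quoted in the example
Et = Itrue - I;                      % true error
et = abs(Et / Itrue) * 100;          % percent relative error
fprintf('I = %.4f, Et = %.6f, et = %.1f%%\n', I, Et, et);
```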

In actual situations, we would have no foreknowledge of the true value. Therefore, 
an approximate error estimate is required. To obtain this estimate, the function’s second 
derivative over the interval can be computed by differentiating the original function twice 
to give 


$$f''(x) = -400 + 4{,}050x - 10{,}800x^2 + 8{,}000x^3$$

The average value of the second derivative can be computed as [Eq. (19.7)] 

$$\bar{f}'' = \frac{\int_0^{0.8} \left(-400 + 4{,}050x - 10{,}800x^2 + 8{,}000x^3\right) dx}{0.8 - 0} = -60$$

which can be substituted into Eq. (19.14) to yield 

$$E_a = -\frac{1}{12}\,(-60)(0.8)^3 = 2.56$$

which is of the same order of magnitude and sign as the true error. A discrepancy does 
exist, however, because of the fact that for an interval of this size, the average second 
derivative is not necessarily an accurate approximation of $f''(\xi)$. Thus, we denote that the 
error is approximate by using the notation $E_a$ rather than exact by using $E_t$. 
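
The approximate error estimate can also be obtained with MATLAB's polynomial utilities. The lines below are a sketch, assuming the polynomial is stored as a coefficient vector (highest power first) so that polyder, polyint, and polyval can be used; the variable names are illustrative.

```matlab
% Approximate error estimate for Example 19.1 using Eqs. (19.7) and (19.14).
p = [400 -900 675 -200 25 0.2];      % coefficients of f(x), highest power first
a = 0; b = 0.8;
d2 = polyder(polyder(p));            % coefficients of f''(x)
D1 = polyint(d2);                    % an antiderivative of f''(x)
fbar2 = (polyval(D1, b) - polyval(D1, a)) / (b - a);   % mean of f'' on [a, b]
Ea = -(1/12) * fbar2 * (b - a)^3;    % approximate truncation error
fprintf('mean second derivative = %.1f, Ea = %.2f\n', fbar2, Ea);
```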


19.3.2 The Composite Trapezoidal Rule 


One way to improve the accuracy of the trapezoidal rule is to divide the integration interval 
from a to b into a number of segments and apply the method to each segment (Fig. 19.9). 
The areas of individual segments can then be added to yield the integral for the entire 
interval. The resulting equations are called composite, or multiple-segment, integration 
formulas. 

Figure 19.9 shows the general format and nomenclature we will use to characterize 
composite integrals. There are n + 1 equally spaced base points ($x_0, x_1, x_2, \ldots, x_n$).
Consequently, there are n segments of equal width: 

$$h = \frac{b - a}{n} \tag{19.15}$$

If a and b are designated as $x_0$ and $x_n$, respectively, the total integral can be represented as 

$$I = \int_{x_0}^{x_1} f(x)\, dx + \int_{x_1}^{x_2} f(x)\, dx + \cdots + \int_{x_{n-1}}^{x_n} f(x)\, dx$$

FIGURE 19.9 
Composite trapezoidal rule. 

Substituting the trapezoidal rule for each integral yields 

$$I = h\, \frac{f(x_0) + f(x_1)}{2} + h\, \frac{f(x_1) + f(x_2)}{2} + \cdots + h\, \frac{f(x_{n-1}) + f(x_n)}{2} \tag{19.16}$$

or, grouping terms: 

$$I = \frac{h}{2} \left[ f(x_0) + 2 \sum_{i=1}^{n-1} f(x_i) + f(x_n) \right] \tag{19.17}$$

or, using Eq. (19.15) to express Eq. (19.17) in the general form of Eq. (19.13): 

$$I = \underbrace{(b - a)}_{\text{Width}}\; \underbrace{\frac{f(x_0) + 2 \sum_{i=1}^{n-1} f(x_i) + f(x_n)}{2n}}_{\text{Average height}} \tag{19.18}$$

Because the summation of the coefficients of f(x) in the numerator divided by 2n is equal 
to 1, the average height represents a weighted average of the function values. According 
to Eq. (19.18), the interior points are given twice the weight of the two end points $f(x_0)$ 
and $f(x_n)$. 

An error for the composite trapezoidal rule can be obtained by summing the individual 
errors for each segment to give 

$$E_t = -\frac{(b - a)^3}{12 n^3} \sum_{i=1}^{n} f''(\xi_i) \tag{19.19}$$

where $f''(\xi_i)$ is the second derivative at a point $\xi_i$ located in segment i. This result can be 
simplified by estimating the mean or average value of the second derivative for the entire 
interval as 

$$\bar{f}'' \cong \frac{\sum_{i=1}^{n} f''(\xi_i)}{n} \tag{19.20}$$

Therefore $\sum_{i=1}^{n} f''(\xi_i) \cong n \bar{f}''$ and Eq. (19.19) can be rewritten as 

$$E_a = -\frac{(b - a)^3}{12 n^2}\, \bar{f}'' \tag{19.21}$$

Thus, if the number of segments is doubled, the truncation error will be quartered. Note 
that Eq. (19.21) is an approximate error because of the approximate nature of Eq. (19.20). 


EXAMPLE 19.2 Composite Application of the Trapezoidal Rule 

Problem Statement. Use the two-segment trapezoidal rule to estimate the integral of 

$$f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5$$

from a = 0 to b = 0.8. Employ Eq. (19.21) to estimate the error. Recall that the exact value 
of the integral is 1.640533. 
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Solution. For n =2 (h = 0.4): 
$$f(0) = 0.2 \qquad f(0.4) = 2.456 \qquad f(0.8) = 0.232$$

$$I = 0.8\, \frac{0.2 + 2(2.456) + 0.232}{4} = 1.0688$$

$$E_t = 1.640533 - 1.0688 = 0.57173 \qquad \varepsilon_t = 34.9\%$$

$$E_a = -\frac{0.8^3}{12(2)^2}\,(-60) = 0.64$$

where −60 is the average second derivative determined previously in Example 19.1. 


The results of the previous example, along with three- through ten-segment 
applications of the trapezoidal rule, are summarized in Table 19.1. Notice how the 
error decreases as the number of segments increases. However, also notice that the rate 
of decrease is gradual. This is because the error is inversely related to the square of n 
[Eq. (19.21)]. Therefore, doubling the number of segments quarters the error. In subse- 
quent sections we develop higher-order formulas that are more accurate and that con- 
verge more quickly on the true integral as the segments are increased. However, before 
investigating these formulas, we will first discuss how MATLAB can be used to imple- 
ment the trapezoidal rule. 
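
As a preview, the two- through ten-segment results can be regenerated directly from Eq. (19.18). The loop below is a minimal sketch, not the book's M-file: it evaluates the polynomial at the n + 1 base points with linspace and applies the weighted-average form of the rule; the variable names are illustrative.

```matlab
% Composite trapezoidal rule, Eq. (19.18), for n = 2..10 segments.
f = @(x) 0.2 + 25*x - 200*x.^2 + 675*x.^3 - 900*x.^4 + 400*x.^5;
a = 0; b = 0.8; Itrue = 1.640533;
nvals = 2:10;
I = zeros(size(nvals));
for k = 1:numel(nvals)
  n = nvals(k);
  x = linspace(a, b, n + 1);         % n + 1 equally spaced base points
  fx = f(x);
  I(k) = (b - a) * (fx(1) + 2*sum(fx(2:end-1)) + fx(end)) / (2*n);
end
et = abs((Itrue - I) / Itrue) * 100; % percent relative error
fprintf('%3s %8s %8s %8s\n', 'n', 'h', 'I', 'et(%)');
fprintf('%3d %8.4f %8.4f %8.1f\n', [nvals; (b - a)./nvals; I; et]);
```

Comparing the rows for n = 2 and n = 4, or n = 5 and n = 10, shows the error dropping by roughly a factor of four each time the number of segments doubles, in line with Eq. (19.21).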


19.3.3 MATLAB M-file: trap 


A simple algorithm to implement the composite trapezoidal rule can be written as in 
Fig. 19.10. The function to be integrated is passed into the M-file along with the limits of 
integration and the number of segments. A loop is then employed to generate the integral 
following Eq. (19.18). 


TABLE 19.1 Results for the composite trapezoidal rule to 
estimate the integral of $f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5$ from 
x = 0 to 0.8. The exact value is 1.640533. 

n     h        I        ε_t (%)
2     0.4      1.0688   34.9
3     0.2667   1.3695   16.5
4     0.2      1.4848   9.5
5     0.16     1.5399   6.1
6     0.1333   1.5703   4.3
7     0.1143   1.5887   3.2
8     0.1      1.6008   2.4
9     0.0889   1.6091   1.9
10    0.08     1.6150   1.6
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function I = trap(func,a,b,n,varargin) 
% trap: composite trapezoidal rule quadrature 
%   I = trap(func,a,b,n,p1,p2,...): 
%       composite trapezoidal rule 
% input: 
%   func = name of function to be integrated 
%   a, b = integration limits 
%   n = number of segments (default = 100) 
%   p1,p2,... = additional parameters used by func 
% output: 
%   I = integral estimate 

if nargin<3,error('at least 3 input arguments required'),end 
if ~(b>a),error('upper bound must be greater than lower'),end 
if nargin<4||isempty(n),n=100;end 

x = a; h = (b - a)/n; 
s = func(a,varargin{:});             % first end point 
for i = 1 : n-1 
  x = x + h; 
  s = s + 2*func(x,varargin{:});     % interior points carry weight 2 
end 
s = s + func(b,varargin{:});         % last end point 
I = (b - a) * s/(2*n); 


FIGURE 19.10 
M-file to implement the composite trapezoidal rule. 


An application of the M-file can be developed to determine the distance fallen by
the free-falling bungee jumper in the first 3 s by evaluating the integral of Eq. (19.3). For
this example, assume the following parameter values: g = 9.81 m/s^2, m = 68.1 kg, and
c_d = 0.25 kg/m. Note that the exact value of the integral can be computed with Eq. (19.4)
as 41.94805.

The function to be integrated can be developed as an M-file or with an anonymous 
function, 


>> v=@(t) sqrt(9.81*68.1/0.25)*tanh(sqrt(9.81*0.25/68.1)*t) 


v =
@(t) sqrt(9.81*68.1/0.25)*tanh(sqrt(9.81*0.25/68.1)*t) 


First, let’s evaluate the integral with a crude five-segment approximation: 


>> format long 
>> trap(v,0,3,5) 


ans =
41.86992959072735


As would be expected, this coarse result is not especially accurate, with a true percent relative
error of 0.186%. To obtain a more accurate result, we can use a very fine approximation based
on 10,000 segments:


>> trap(v,0,3,10000)


ans =
41.94804999917528


which is very close to the true value. 
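The true error of the five-segment estimate can be checked against the closed-form distance. The script below is a sketch under two assumptions: that Eq. (19.4) has the form z(t) = (m/c_d) ln[cosh(sqrt(g c_d/m) t)], which is how it is applied later in this chapter, and that the built-in trapz function (described in Sec. 19.6.2) is an acceptable stand-in for trap so that the script runs on its own. The names zexact, I5, and et5 are illustrative.

```matlab
% Sketch: true percent relative error of the five-segment estimate.
g = 9.81; m = 68.1; cd = 0.25;
v = @(t) sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t);
zexact = m/cd*log(cosh(sqrt(g*cd/m)*3));   % closed-form distance at t = 3 s
t5 = linspace(0, 3, 6);                    % five equal segments, six points
I5 = trapz(t5, v(t5));                     % composite trapezoidal rule
et5 = abs((zexact - I5)/zexact)*100;       % true percent relative error
fprintf('exact = %.5f, 5-segment = %.5f, error = %.3f%%\n', zexact, I5, et5)
```

Because the velocity curve is smooth over the first 3 s, even five segments give an error well under 1%.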


19.4 SIMPSON’S RULES


Aside from applying the trapezoidal rule with finer segmentation, another way to obtain 
a more accurate estimate of an integral is to use higher-order polynomials to connect the 
points. For example, if there is an extra point midway between f(a) and f(b), the three 
points can be connected with a parabola (Fig. 19.11a). If there are two points equally 
spaced between f(a) and f(b), the four points can be connected with a third-order
polynomial (Fig. 19.11b). The formulas that result from taking the integrals under these
polynomials are called Simpson’s rules.


19.4.1 Simpson’s 1/3 Rule 


Simpson’s 1/3 rule corresponds to the case where the polynomial in Eq. (19.8) is
second-order:

$$I = \int_{x_0}^{x_2} \left[ \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} f(x_0) + \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} f(x_1) + \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)} f(x_2) \right] dx$$

where a and b are designated as x_0 and x_2, respectively. The result of the integration is

$$I = \frac{h}{3}\left[ f(x_0) + 4 f(x_1) + f(x_2) \right] \qquad (19.22)$$

where, for this case, h = (b − a)/2. This equation is known as Simpson’s 1/3 rule. The
label “1/3” stems from the fact that h is divided by 3 in Eq. (19.22). Simpson’s 1/3 rule
can also be expressed using the format of Eq. (19.13):

$$I = (b - a)\,\frac{f(x_0) + 4 f(x_1) + f(x_2)}{6} \qquad (19.23)$$

where a = x_0, b = x_2, and x_1 = the point midway between a and b, which is given by
(a + b)/2. Notice that, according to Eq. (19.23), the middle point is weighted by
two-thirds and the two end points by one-sixth.


FIGURE 19.11
(a) Graphical depiction of Simpson’s 1/3 rule: It consists of taking the area under a parabola
connecting three points. (b) Graphical depiction of Simpson’s 3/8 rule: It consists of taking
the area under a cubic equation connecting four points.


It can be shown that a single-segment application of Simpson’s 1/3 rule has a truncation
error of

$$E_t = -\frac{1}{90} h^5 f^{(4)}(\xi)$$

or, because h = (b − a)/2:

$$E_t = -\frac{(b - a)^5}{2880} f^{(4)}(\xi) \qquad (19.24)$$

where ξ lies somewhere in the interval from a to b. Thus, Simpson’s 1/3 rule is more
accurate than the trapezoidal rule. However, comparison with Eq. (19.14) indicates that it is
more accurate than expected. Rather than being proportional to the third derivative, the
error is proportional to the fourth derivative. Consequently, Simpson’s 1/3 rule is
third-order accurate even though it is based on only three points. In other words, it yields exact
results for cubic polynomials even though it is derived from a parabola!


EXAMPLE 19.3 Single Application of Simpson’s 1/3 Rule

Problem Statement. Use Eq. (19.23) to integrate

$$f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5$$

from a = 0 to b = 0.8. Employ Eq. (19.24) to estimate the error. Recall that the exact
integral is 1.640533.


Solution. n = 2 (h = 0.4):

f(0) = 0.2    f(0.4) = 2.456    f(0.8) = 0.232

$$I = 0.8\,\frac{0.2 + 4(2.456) + 0.232}{6} = 1.367467$$

$$E_t = 1.640533 - 1.367467 = 0.2730667 \qquad \varepsilon_t = 16.6\%$$

which is approximately five times more accurate than for a single application of the
trapezoidal rule (Example 19.1). The approximate error can be estimated as

$$E_a = -\frac{0.8^5}{2880}(-2400) = 0.2730667$$

where −2400 is the average fourth derivative for the interval. As was the case in
Example 19.1, the error is approximate (E_a) because the average fourth derivative is
generally not an exact estimate of f^(4)(ξ). However, because this case deals with a
fifth-order polynomial, the result matches exactly.
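The arithmetic of this example can be reproduced with MATLAB's polynomial functions. The script below is a sketch, not part of the original example: it stores the polynomial as a coefficient vector (an assumption made for convenience), evaluates Eq. (19.23), and obtains the average fourth derivative as the change in the third derivative divided by the interval width. The names p, d3, f4avg, and Itrue are illustrative.

```matlab
% Sketch: single application of Simpson's 1/3 rule, Eq. (19.23), with the
% error estimate of Eq. (19.24) based on the average fourth derivative.
p = [400 -900 675 -200 25 0.2];   % coefficients, highest power first
a = 0; b = 0.8;
xm = (a + b)/2;                   % midpoint
I = (b - a)*(polyval(p,a) + 4*polyval(p,xm) + polyval(p,b))/6;

% average fourth derivative = [f'''(b) - f'''(a)]/(b - a)
d3 = polyder(polyder(polyder(p)));               % coefficients of f'''
f4avg = (polyval(d3,b) - polyval(d3,a))/(b - a);
Ea = -(b - a)^5/2880*f4avg;                      % approximate error

P = polyint(p);                                  % antiderivative
Itrue = polyval(P,b) - polyval(P,a);             % exact integral
Et = Itrue - I;                                  % true error
fprintf('I = %.6f, Et = %.7f, Ea = %.7f\n', I, Et, Ea)
```

For this fifth-order polynomial the estimated and true errors agree to within roundoff.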


19.4.2 The Composite Simpson’s 1/3 Rule 


Just as with the trapezoidal rule, Simpson’s rule can be improved by dividing the integra- 
tion interval into a number of segments of equal width (Fig. 19.12). The total integral can 
be represented as 


$$I = \int_{x_0}^{x_2} f(x)\,dx + \int_{x_2}^{x_4} f(x)\,dx + \cdots + \int_{x_{n-2}}^{x_n} f(x)\,dx \qquad (19.25)$$


FIGURE 19.12
Composite Simpson’s 1/3 rule. The relative weights are depicted above the function values.
Note that the method can be employed only if the number of segments is even.


Substituting Simpson’s 1/3 rule for each integral yields

$$I = 2h\,\frac{f(x_0) + 4f(x_1) + f(x_2)}{6} + 2h\,\frac{f(x_2) + 4f(x_3) + f(x_4)}{6} + \cdots + 2h\,\frac{f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)}{6}$$

or, grouping terms and using Eq. (19.15):

$$I = (b - a)\,\frac{f(x_0) + 4\sum_{i=1,3,5}^{n-1} f(x_i) + 2\sum_{j=2,4,6}^{n-2} f(x_j) + f(x_n)}{3n} \qquad (19.26)$$

Notice that, as illustrated in Fig. 19.12, an even number of segments must be utilized
to implement the method. In addition, the coefficients “4” and “2” in Eq. (19.26) might
seem peculiar at first glance. However, they follow naturally from Simpson’s 1/3 rule. As
illustrated in Fig. 19.12, the odd points represent the middle term for each application and
hence carry the weight of four from Eq. (19.23). The even points are common to adjacent
applications and hence are counted twice.

An error estimate for the composite Simpson’s rule is obtained in the same fashion as
for the trapezoidal rule by summing the individual errors for the segments and averaging
the derivative to yield

$$E_a = -\frac{(b - a)^5}{180 n^4}\,\bar{f}^{(4)} \qquad (19.27)$$

where \bar{f}^{(4)} is the average fourth derivative for the interval.


EXAMPLE 19.4 Composite Simpson’s 1/3 Rule

Problem Statement. Use Eq. (19.26) with n = 4 to estimate the integral of

$$f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5$$

from a = 0 to b = 0.8. Employ Eq. (19.27) to estimate the error. Recall that the exact
integral is 1.640533.


Solution. n = 4 (h = 0.2):

f(0) = 0.2      f(0.2) = 1.288
f(0.4) = 2.456  f(0.6) = 3.464
f(0.8) = 0.232

From Eq. (19.26):

$$I = 0.8\,\frac{0.2 + 4(1.288 + 3.464) + 2(2.456) + 0.232}{12} = 1.623467$$

$$E_t = 1.640533 - 1.623467 = 0.017067 \qquad \varepsilon_t = 1.04\%$$

The estimated error [Eq. (19.27)] is

$$E_a = -\frac{0.8^5}{180(4)^4}(-2400) = 0.017067$$

which is exact (as was also the case for Example 19.3).
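Equation (19.26) translates into a few lines of MATLAB. The following script is a sketch under stated assumptions rather than an M-file from this chapter: the index bookkeeping reflects the fact that MATLAB arrays start at 1, and the value −2400 for the average fourth derivative is taken from Example 19.3.

```matlab
% Sketch of Eq. (19.26): composite Simpson's 1/3 rule for an even n.
f = @(x) 0.2 + 25*x - 200*x.^2 + 675*x.^3 - 900*x.^4 + 400*x.^5;
a = 0; b = 0.8; n = 4;
if mod(n,2) ~= 0, error('number of segments must be even'), end
x = linspace(a, b, n + 1);
fx = f(x);
% MATLAB indices are shifted by one: x_0 is x(1), so the odd-numbered
% points x_1, x_3, ... are fx(2:2:n) and the even interior points
% x_2, x_4, ... are fx(3:2:n-1).
I = (b - a)*(fx(1) + 4*sum(fx(2:2:n)) + 2*sum(fx(3:2:n-1)) + fx(n+1))/(3*n);
Ea = -(b - a)^5/(180*n^4)*(-2400);   % Eq. (19.27), average f'''' = -2400
fprintf('I = %.6f, Ea = %.6f\n', I, Ea)
```

Changing n to any other even value reuses the same expression; for n = 2 the sum over even interior points is empty and the formula reduces to Eq. (19.23).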


As in Example 19.4, the composite version of Simpson’s 1/3 rule is considered supe- 
rior to the trapezoidal rule for most applications. However, as mentioned previously, it is 
limited to cases where the values are equispaced. Further, it is limited to situations where 
there are an even number of segments and an odd number of points. Consequently, as dis- 
cussed in Sec. 19.4.3, an odd-segment—even-point formula known as Simpson’s 3/8 rule 
can be used in conjunction with the 1/3 rule to permit evaluation of both even and odd 
numbers of equispaced segments. 


19.4.3 Simpson’s 3/8 Rule 


In a similar manner to the derivation of the trapezoidal and Simpson’s 1/3 rule, a
third-order Lagrange polynomial can be fit to four points and integrated to yield

$$I = \frac{3h}{8}\left[ f(x_0) + 3f(x_1) + 3f(x_2) + f(x_3) \right]$$

where h = (b − a)/3. This equation is known as Simpson’s 3/8 rule because h is multiplied
by 3/8. It is the third Newton-Cotes closed integration formula. The 3/8 rule can also be
expressed in the form of Eq. (19.13):

$$I = (b - a)\,\frac{f(x_0) + 3f(x_1) + 3f(x_2) + f(x_3)}{8} \qquad (19.28)$$

Thus, the two interior points are given weights of three-eighths, whereas the end points are
weighted with one-eighth. Simpson’s 3/8 rule has an error of

$$E_t = -\frac{3}{80} h^5 f^{(4)}(\xi)$$

or, because h = (b − a)/3:

$$E_t = -\frac{(b - a)^5}{6480} f^{(4)}(\xi) \qquad (19.29)$$

Because the denominator of Eq. (19.29) is larger than for Eq. (19.24), the 3/8 rule is
somewhat more accurate than the 1/3 rule.

Simpson’s 1/3 rule is usually the method of preference because it attains third-order
accuracy with three points rather than the four points required for the 3/8 version.
However, the 3/8 rule has utility when the number of segments is odd. For instance, in Example
19.4 we used Simpson’s rule to integrate the function for four segments. Suppose that you
desired an estimate for five segments. One option would be to use a composite version of
the trapezoidal rule as was done in Example 19.2. This may not be advisable, however,
because of the large truncation error associated with this method. An alternative would be
to apply Simpson’s 1/3 rule to the first two segments and Simpson’s 3/8 rule to the last
three (Fig. 19.13). In this way, we could obtain an estimate with third-order accuracy across
the entire interval.


FIGURE 19.13
Illustration of how Simpson’s 1/3 and 3/8 rules can be applied in tandem to handle multiple
applications with odd numbers of intervals.


EXAMPLE 19.5 Simpson’s 3/8 Rule 


Problem Statement. (a) Use Simpson’s 3/8 rule to integrate 


$$f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5$$


from a = 0 to b = 0.8. (b) Use it in conjunction with Simpson’s 1/3 rule to integrate the
same function for five segments.


Solution. (a) A single application of Simpson’s 3/8 rule requires four equally spaced
points:

f(0) = 0.2              f(0.2667) = 1.432724
f(0.5333) = 3.487177    f(0.8) = 0.232

Using Eq. (19.28):

$$I = 0.8\,\frac{0.2 + 3(1.432724 + 3.487177) + 0.232}{8} = 1.51917$$

(b) The data needed for a five-segment application (h = 0.16) are

f(0) = 0.2              f(0.16) = 1.296919
f(0.32) = 1.743393      f(0.48) = 3.186015
f(0.64) = 3.181929      f(0.80) = 0.232

The integral for the first two segments is obtained using Simpson’s 1/3 rule:

$$I = 0.32\,\frac{0.2 + 4(1.296919) + 1.743393}{6} = 0.3803237$$

For the last three segments, the 3/8 rule can be used to obtain

$$I = 0.48\,\frac{1.743393 + 3(3.186015 + 3.181929) + 0.232}{8} = 1.264754$$

The total integral is computed by summing the two results:

$$I = 0.3803237 + 1.264754 = 1.645077$$
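Part (b) can be reproduced with a short script. This is a sketch of the tandem strategy only for the five-segment case shown here; the names I13 and I38 are illustrative and do not come from an M-file in this chapter.

```matlab
% Sketch: five segments handled as 1/3 rule (first two) + 3/8 rule (last three).
f = @(x) 0.2 + 25*x - 200*x.^2 + 675*x.^3 - 900*x.^4 + 400*x.^5;
a = 0; b = 0.8; n = 5;
x = linspace(a, b, n + 1);      % six equally spaced points, h = 0.16
fx = f(x);
h = (b - a)/n;
I13 = 2*h*(fx(1) + 4*fx(2) + fx(3))/6;             % Eq. (19.23) on [x_0, x_2]
I38 = 3*h*(fx(3) + 3*(fx(4) + fx(5)) + fx(6))/8;   % Eq. (19.28) on [x_2, x_5]
I = I13 + I38;
fprintf('I13 = %.7f, I38 = %.6f, I = %.6f\n', I13, I38, I)
```

Note that each rule is scaled by the width of the subinterval it covers (2h and 3h, respectively), not by the full width b − a.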


19.5 HIGHER-ORDER NEWTON-COTES FORMULAS


As noted previously, the trapezoidal rule and both of Simpson’s rules are members of 
a family of integrating equations known as the Newton-Cotes closed integration formu- 
las. Some of the formulas are summarized in Table 19.2 along with their truncation-error 
estimates. 

Notice that, as was the case with Simpson’s 1/3 and 3/8 rules, the five- and six-point 
formulas have the same order error. This general characteristic holds for the higher-point 
formulas and leads to the result that the even-segment—odd-point formulas (e.g., 1/3 rule 
and Boole’s rule) are usually the methods of preference. 


TABLE 19.2 Newton-Cotes closed integration formulas. The formulas are presented in the format of Eq. (19.13)
so that the weighting of the data points to estimate the average height is apparent. The step size is
given by h = (b − a)/n.


Segments
(n)   Points   Name                  Formula                                                                                Truncation Error
1     2        Trapezoidal rule      (b − a) [f(x_0) + f(x_1)]/2                                                            −(1/12) h^3 f''(ξ)
2     3        Simpson’s 1/3 rule    (b − a) [f(x_0) + 4f(x_1) + f(x_2)]/6                                                  −(1/90) h^5 f^(4)(ξ)
3     4        Simpson’s 3/8 rule    (b − a) [f(x_0) + 3f(x_1) + 3f(x_2) + f(x_3)]/8                                        −(3/80) h^5 f^(4)(ξ)
4     5        Boole’s rule          (b − a) [7f(x_0) + 32f(x_1) + 12f(x_2) + 32f(x_3) + 7f(x_4)]/90                        −(8/945) h^7 f^(6)(ξ)
5     6                              (b − a) [19f(x_0) + 75f(x_1) + 50f(x_2) + 50f(x_3) + 75f(x_4) + 19f(x_5)]/288          −(275/12,096) h^7 f^(6)(ξ)
However, it must also be stressed that, in engineering and science practice, the
higher-order (i.e., greater than four-point) formulas are not commonly used. Simpson’s rules are
sufficient for most applications. Accuracy can be improved by using the composite
version. Furthermore, when the function is known and high accuracy is required, methods
such as Romberg integration or Gauss quadrature, described in Chap. 20, offer viable and
attractive alternatives.


19.6 INTEGRATION WITH UNEQUAL SEGMENTS


To this point, all formulas for numerical integration have been based on equispaced data
points. In practice, there are many situations where this assumption does not hold and
we must deal with unequal-sized segments. For example, experimentally derived data are
often of this type. For these cases, one method is to apply the trapezoidal rule to each
segment and sum the results:

$$I = h_1\,\frac{f(x_0) + f(x_1)}{2} + h_2\,\frac{f(x_1) + f(x_2)}{2} + \cdots + h_n\,\frac{f(x_{n-1}) + f(x_n)}{2} \qquad (19.30)$$

where h_i = the width of segment i. Note that this was the same approach used for the
composite trapezoidal rule. The only difference between Eqs. (19.16) and (19.30) is that the h’s
in the former are constant.


EXAMPLE 19.6 Trapezoidal Rule with Unequal Segments


Problem Statement. The information in Table 19.3 was generated using the same poly- 
nomial employed in Example 19.1. Use Eq. (19.30) to determine the integral for these data. 
Recall that the correct answer is 1.640533. 


TABLE 19.3 Data for f(x) = 0.2 + 25x − 200x^2 + 675x^3 − 900x^4 + 400x^5, with
unequally spaced values of x.


x       f(x)         x       f(x)
0.00    0.200000     0.44    2.842985
0.12    1.309729     0.54    3.507297
0.22    1.305241     0.64    3.181929
0.32    1.743393     0.70    2.363000
0.36    2.074903     0.80    0.232000
0.40    2.456000


Solution. Applying Eq. (19.30) yields

$$I = 0.12\,\frac{0.2 + 1.309729}{2} + 0.10\,\frac{1.309729 + 1.305241}{2} + \cdots + 0.10\,\frac{2.363 + 0.232}{2} = 1.594801$$

which represents an absolute percent relative error of ε_t = 2.8%.
19.6.1 MATLAB M-file: trapuneq


A simple algorithm to implement the trapezoidal rule for unequally spaced data can be 
written as in Fig. 19.14. Two vectors, x and y, holding the independent and dependent 
variables are passed into the M-file. Two error traps are included to ensure that (a) the two 
vectors are of the same length and (b) the x’s are in ascending order.¹ A loop is employed to
generate the integral. Notice that we have modified the subscripts from those of Eq. (19.30) 
to account for the fact that MATLAB does not allow zero subscripts in arrays. 

An application of the M-file can be developed for the same problem that was solved 
in Example 19.6: 


>> x = [0 .12 .22 .32 .36 .4 .44 .54 .64 .7 .8];
>> y = 0.2+25*x-200*x.^2+675*x.^3-900*x.^4+400*x.^5;
>> trapuneq(x,y) 


ans = 


1.5948 


which is identical to the result obtained in Example 19.6. 
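The same sum can also be written without a loop. The following sketch is an alternative to the loop-based M-file, not a replacement for it: it uses diff to form the segment widths h_i of Eq. (19.30) and multiplies them by the average height of each segment. The names h and ybar are illustrative.

```matlab
% Sketch: vectorized form of Eq. (19.30) for unequally spaced data.
x = [0 .12 .22 .32 .36 .4 .44 .54 .64 .7 .8];
y = 0.2+25*x-200*x.^2+675*x.^3-900*x.^4+400*x.^5;
h = diff(x);                          % segment widths h_i
ybar = (y(1:end-1) + y(2:end))/2;     % average height of each segment
I = sum(h.*ybar);                     % sum of the trapezoid areas
fprintf('I = %.6f\n', I)
```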


FIGURE 19.14 
M-file to implement the trapezoidal rule for unequally spaced data. 


function I = trapuneq(x,y)
% trapuneq: unequal spaced trapezoidal rule quadrature
%   I = trapuneq(x,y):
%   Applies the trapezoidal rule to determine the integral
%   for n data points (x, y) where x and y must be of the
%   same length and x must be monotonically ascending
% input:
%   x = vector of independent variables
%   y = vector of dependent variables
% output:
%   I = integral estimate

if nargin<2,error('at least 2 input arguments required'),end
if any(diff(x)<0),error('x not monotonically ascending'),end
n = length(x);
if length(y)~=n,error('x and y must be same length'); end
s = 0;
for k = 1:n-1
  s = s + (x(k+1)-x(k))*(y(k)+y(k+1))/2;
end
I = s;


¹ The diff function is described in Sec. 21.7.1.
19.6.2 MATLAB Functions: trapz and cumtrapz


MATLAB has a built-in function that evaluates integrals for data in the same fashion as the
M-file we just presented in Fig. 19.14. It has the general syntax


z = trapz(x, y)


where the two vectors, x and y, hold the independent and dependent variables,
respectively. Here is a simple MATLAB session that uses this function to integrate the data from
Table 19.3:

>> x = [0 .12 .22 .32 .36 .4 .44 .54 .64 .7 .8];
>> y = 0.2+25*x-200*x.^2+675*x.^3-900*x.^4+400*x.^5;
>> trapz(x,y)


ans =
1.5948


In addition, MATLAB has another function, cumtrapz, that computes the cumulative
integral. A simple representation of its syntax is


z = cumtrapz(x, y)


where the two vectors, x and y, hold the independent and dependent variables, respectively,
and z = a vector whose elements z(k) hold the integral from x(1) to x(k).


EXAMPLE 19.7 Using Numerical Integration to Compute Distance from Velocity


Problem Statement. As described at the beginning of this chapter, a nice application
of integration is to compute the distance z(t) of an object based on its velocity v(t) as in
[recall Eq. (19.2)]:

$$z(t) = \int_0^t v(t)\,dt$$


Suppose that we had measurements of velocity at a series of discrete unequally spaced 
times during free fall. Use Eq. (19.2) to synthetically generate such information for a 
70-kg jumper with a drag coefficient of 0.275 kg/m. Incorporate some random error 
by rounding the velocities to the nearest integer. Then use cumtrapz to determine the 
distance fallen and compare the results to the analytical solution [Eq. (19.4)]. In addi- 
tion, develop a plot of the analytical and computed distances along with velocity on the 
same graph. 


Solution. Some unequally spaced times and rounded velocities can be generated as 


>> format short g 

>> t=[0 1 1.4 2 3 4.3 6 6.7 8]; 

>> g=9.81;m=70;cd=0.275; 

>> v=round(sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t)); 


The distances can then be computed as 
>> z=cumtrapz(t,v) 


z =
0 5 9.6 19.2 41.7 80.7 144.45 173.85 231.7

Thus, after 8 seconds, the jumper has fallen 231.7 m. This result is reasonably close to the
analytical solution [Eq. (19.4)]:

$$z(t) = \frac{70}{0.275} \ln\left[\cosh\left(\sqrt{\frac{9.81(0.275)}{70}}\;8\right)\right] = 234.1$$


A graph of the numerical and analytical solutions along with both the exact and rounded 
velocities can be generated with the following commands: 


>> ta=linspace(t(1),t(length(t)));
>> za=m/cd*log(cosh(sqrt(g*cd/m)*ta));
>> plot(ta,za,t,z,'o')
>> title('Distance versus time')
>> xlabel('t (s)'),ylabel('x (m)')
>> legend('analytical','numerical')


As in Fig. 19.15, the numerical and analytical results match fairly well. 


FIGURE 19.15
Plot of distance versus time. The line was computed with the analytical solution, whereas the
points were determined numerically with the cumtrapz function.
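The relationship between cumtrapz and trapz, and the size of the discrepancy at t = 8 s, can be checked with a few extra lines. This is a sketch that repeats the data generation from the example so that it runs on its own; the names za and err are illustrative.

```matlab
% Sketch: cumulative integral versus the analytical distance at the data times.
t = [0 1 1.4 2 3 4.3 6 6.7 8];
g = 9.81; m = 70; cd = 0.275;
v = round(sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t));   % rounded velocities
z = cumtrapz(t, v);                             % z(k) = integral from t(1) to t(k)
za = m/cd*log(cosh(sqrt(g*cd/m)*t));            % analytical distance at the same times
err = abs(za(end) - z(end))/za(end)*100;        % percent error at t = 8 s
disp([t' z' za'])
fprintf('final error = %.2f%%\n', err)
```

The last element of z is the same number that trapz(t,v) would return; cumtrapz simply keeps the running totals along the way.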
TABLE 19.4 Newton-Cotes open integration formulas. The formulas are presented in the format of Eq. (19.13) so
that the weighting of the data points to estimate the average height is apparent. The step size is
given by h = (b − a)/n.


Segments
(n)   Points   Name               Formula                                                                     Truncation Error
2     1        Midpoint method    (b − a) f(x_1)                                                              (1/3) h^3 f''(ξ)
3     2                           (b − a) [f(x_1) + f(x_2)]/2                                                 (3/4) h^3 f''(ξ)
4     3                           (b − a) [2f(x_1) − f(x_2) + 2f(x_3)]/3                                      (14/45) h^5 f^(4)(ξ)
5     4                           (b − a) [11f(x_1) + f(x_2) + f(x_3) + 11f(x_4)]/24                          (95/144) h^5 f^(4)(ξ)
6     5                           (b − a) [11f(x_1) − 14f(x_2) + 26f(x_3) − 14f(x_4) + 11f(x_5)]/20           (41/140) h^7 f^(6)(ξ)


19.7 OPEN METHODS

Recall from Fig. 19.6b that open integration formulas have limits that extend beyond the 
range of the data. Table 19.4 summarizes the Newton-Cotes open integration formulas. The 
formulas are expressed in the form of Eq. (19.13) so that the weighting factors are evident. 
As with the closed versions, successive pairs of the formulas have the same-order error. 
The even-segment—odd-point formulas are usually the methods of preference because they 
require fewer points to attain the same accuracy as the odd-segment—even-point formulas. 
The open formulas are not often used for definite integration. However, they have util- 
ity for analyzing improper integrals. In addition, they will have relevance to our discussion 
of methods for solving ordinary differential equations in Chaps. 22 and 23. 
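
To see why open formulas are handy for improper integrals, consider an integrand that is singular at one limit. The script below is a sketch, not an example from the text: it applies the midpoint method of Table 19.4 in composite form to the integral of 1/sqrt(x) from 0 to 1, whose exact value is 2. The integrand and the segment count are illustrative assumptions.

```matlab
% Sketch: composite midpoint method on an integrand that is singular at x = 0.
f = @(x) 1./sqrt(x);          % f(0) is infinite, so closed formulas fail here
a = 0; b = 1; n = 1000;
h = (b - a)/n;
xm = a + h*((1:n) - 0.5);     % midpoints of the n segments
I = h*sum(f(xm));             % each segment contributes h*f(midpoint)
fprintf('I = %.4f (exact value is 2)\n', I)
```

Because the function is never evaluated at the end points, the singularity at x = 0 causes no difficulty, although convergence for such integrands is slow.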
19.8 MULTIPLE INTEGRALS 


Multiple integrals are widely used in engineering and science. For example, a general 
equation to compute the average of a two-dimensional function can be written as [recall 
Eq. (19.7)] 


$$\bar{f} = \frac{\int_c^d \left( \int_a^b f(x, y)\,dx \right) dy}{(d - c)(b - a)} \qquad (19.31)$$

The numerator is called a double integral.

The techniques discussed in this chapter (and Chap. 20) can be readily employed to 
evaluate multiple integrals. A simple example would be to take the double integral of a 
function over a rectangular area (Fig. 19.16). 

Recall from calculus that such integrals can be computed as iterated integrals: 


$$\int_c^d \left( \int_a^b f(x, y)\,dx \right) dy = \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx \qquad (19.32)$$


Thus, the integral in one of the dimensions is evaluated first. The result of this first integra- 
tion is integrated in the second dimension. Equation (19.32) states that the order of integra- 
tion is not important. 
FIGURE 19.16
Double integral as the area under the function surface.


A numerical double integral would be based on the same idea. First, methods such as
the composite trapezoidal or Simpson’s rule would be applied in the first dimension with
each value of the second dimension held constant. Then the method would be applied to
integrate the second dimension. The approach is illustrated in the following example.


EXAMPLE 19.8 Using Double Integral to Determine Average Temperature


Problem Statement. Suppose that the temperature of a rectangular heated plate is de- 
scribed by the following function: 


$$T(x, y) = 2xy + 2x - x^2 - 2y^2 + 72$$


If the plate is 8 m long (x dimension) and 6 m wide (y dimension), compute the average 
temperature. 


Solution. First, let us merely use two-segment applications of the trapezoidal rule in each 
dimension. The temperatures at the necessary x and y values are depicted in Fig. 19.17. 
Note that a simple average of these values is 47.33. The function can also be evaluated 
analytically to yield a result of 58.66667. 

To make the same evaluation numerically, the trapezoidal rule is first implemented 
along the x dimension for each y value. These values are then integrated along the y dimen- 
sion to give the final result of 2544. Dividing this by the area yields the average tempera- 
ture as 2544/(6 x 8) = 53. 

Now we can apply a single-segment Simpson’s 1/3 rule in the same fashion. This results 
in an integral of 2816 and an average of 58.66667, which is exact. Why does this occur? Recall
that Simpson’s 1/3 rule yielded perfect results for cubic polynomials. Since the highest-order 
term in the function is second order, the same exact result occurs for the present case. 
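
Both evaluations can be reproduced in a few lines. The script below is a sketch under stated assumptions: the temperatures are sampled at x = 0, 4, 8 and y = 0, 3, 6, the trapezoidal result is formed with nested calls to the built-in trapz function, and the Simpson’s 1/3 result is written with a weight vector in the format of Eq. (19.23). The names Tg, Ix, and w are illustrative.

```matlab
% Sketch: double integral of the plate temperature on a 3-by-3 grid.
T = @(x,y) 2*x.*y + 2*x - x.^2 - 2*y.^2 + 72;
x = [0 4 8]; y = [0 3 6];
[X, Y] = meshgrid(x, y);       % rows of Tg correspond to y, columns to x
Tg = T(X, Y);

% two-segment trapezoidal rule: integrate along x first, then along y
Ix = trapz(x, Tg, 2);          % one x-integral for each y value
Itrap = trapz(y(:), Ix(:));

% single application of Simpson's 1/3 rule in each dimension
w = [1 4 1]/6;                 % weights from Eq. (19.23)
Isimp = (6 - 0)*(8 - 0)*(w*Tg*w');

Tavg_trap = Itrap/(6*8);       % average temperature, trapezoidal rule
Tavg_simp = Isimp/(6*8);       % average temperature, Simpson's 1/3 rule
fprintf('trapezoidal: %g (avg %g), Simpson: %g (avg %.5f)\n', ...
        Itrap, Tavg_trap, Isimp, Tavg_simp)
```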
          x = 0   x = 4   x = 8
y = 6:      0      40      48      (8 − 0) [0 + 2(40) + 48]/4 = 256
y = 3:     54      70      54      (8 − 0) [54 + 2(70) + 54]/4 = 496
y = 0:     72      64      24      (8 − 0) [72 + 2(64) + 24]/4 = 448

(6 − 0) [256 + 2(496) + 448]/4 = 2544


FIGURE 19.17
Numerical evaluation of a double integral using the two-segment trapezoidal rule.


For higher-order algebraic functions as well as transcendental functions, it would be 
necessary to use composite applications to attain accurate integral estimates. In addition, 
Chap. 20 introduces techniques that are more efficient than the Newton-Cotes formulas for 
evaluating integrals of given functions. These often provide a superior means to implement 
the numerical integrations for multiple integrals. 


19.8.1 MATLAB Functions: integral2 and integral3 


MATLAB has functions to implement both double (integral2) and triple (integral3) inte- 
gration. A simple representation of the syntax for integral2 is 


q = integral2(fun, xmin, xmax, ymin, ymax)


where q is the double integral of the function fun over the ranges from xmin to xmax and ymin
to ymax. Because integral2 passes arrays of x and y values to fun, the function must be
written with element-wise operators.

Here is an example of how this function can be used to compute the double integral
evaluated in Example 19.8:


>> q = integral2(@(x,y) 2*x.*y+2*x-x.^2-2*y.^2+72,0,8,0,6)


q =
2816
19.9 CASE STUDY: COMPUTING WORK WITH NUMERICAL INTEGRATION


Background. The calculation of work is an important component of many areas of engi- 
neering and science. The general formula is 


Work = force x distance 


When you were introduced to this concept in high school physics, simple applications were 
presented using forces that remained constant throughout the displacement. For example, 
if a force of 10 N was used to pull a block a distance of 5 m, the work would be calculated 
as 50 J (1 joule = 1 N·m).

Although such a simple computation is useful for introducing the concept, realistic 
problem settings are usually more complex. For example, suppose that the force varies dur- 
ing the course of the calculation. In such cases, the work equation is reexpressed as 


$$W = \int_{x_0}^{x_n} F(x)\,dx \qquad (19.33)$$


where W = work (J), x_0 and x_n = the initial and final positions (m), respectively, and F(x) =
a force that varies as a function of position (N). If F(x) is easy to integrate, Eq. (19.33) 
can be evaluated analytically. However, in a realistic problem setting, the force might not 
be expressed in such a manner. In fact, when analyzing measured data, the force might be 
available only in tabular form. For such cases, numerical integration is the only viable op- 
tion for the evaluation. 

Further complexity is introduced if the angle between the force and the direction of 
movement also varies as a function of position (Fig. 19.18). The work equation can be 
modified further to account for this effect, as in 


$$W = \int_{x_0}^{x_n} F(x)\cos[\theta(x)]\,dx \qquad (19.34)$$


Again, if F(x) and θ(x) are simple functions, Eq. (19.34) might be solved analytically. However,
as in Fig. 19.18, it is more likely that the functional relationship is complicated. For
this situation, numerical methods provide the only alternative for determining the integral.

Suppose that you have to perform the computation for the situation depicted in 
Fig. 19.18. Although the figure shows the continuous values for F(x) and θ(x), assume that,
because of experimental constraints, you are provided with only discrete measurements at 
x = 5-m intervals (Table 19.5). Use single- and composite versions of the trapezoidal rule 
and Simpson’s 1/3 and 3/8 rules to compute work for these data. 


Solution. The results of the analysis are summarized in Table 19.6. A percent relative 
error ε_t was computed in reference to a true value of the integral of 129.52 that was estimated
on the basis of values taken from Fig. 19.18 at 1-m intervals.

The results are interesting because the most accurate outcome occurs for the simple 
two-segment trapezoidal rule. More refined estimates using more segments, as well as 
Simpson’s rules, yield less accurate results. 

The reason for this apparently counterintuitive result is that the coarse spacing of the
points is not adequate to capture the variations of the forces and angles. This is particularly
evident in Fig. 19.19, where we have plotted the continuous curve for the product of F(x)
and cos [θ(x)]. Notice how the use of seven points to characterize the continuously varying
function misses the two peaks at x = 2.5 and 12.5 m. The omission of these two points
effectively limits the accuracy of the numerical integration estimates in Table 19.6. The fact
that the two-segment trapezoidal rule yields the most accurate result is due to the chance
positioning of the points for this particular problem (Fig. 19.20).


FIGURE 19.18
The case of a variable force acting on a block. For this case the angle, as well as the
magnitude, of the force varies.


TABLE 19.5 Data for force F(x) and angle θ(x) as a function of position x.

x, m    F(x), N    θ, rad    F(x) cos θ
 0        0.0       0.50       0.0000
 5        9.0       1.40       1.5297
10       13.0       0.75       9.5120
15       14.0       0.90       8.7025
20       10.5       1.30       2.8087
25       12.0       1.48       1.0881
30        5.0       1.50       0.3537


TABLE 19.6 Estimates of work calculated using the trapezoidal rule and
Simpson’s rules. The percent relative error ε_t was computed in
reference to a true value of the integral (129.52 J) that was
estimated on the basis of values at 1-m intervals.

Technique             Segments    Work      ε_t, %
Trapezoidal rule         1          5.31    95.9
                         2        133.19     2.84
                         3        124.98     3.51
                         6        119.09     8.05
Simpson’s 1/3 rule       2        175.82    35.75
                         6        117.13     9.57
Simpson’s 3/8 rule       3        139.93     8.04


FIGURE 19.19
A continuous plot of F(x) cos [θ(x)] versus position with the seven discrete points used to
develop the numerical integration estimates in Table 19.6. Notice how the use of seven points
to characterize this continuously varying function misses two peaks at x = 2.5 and 12.5 m.
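
The entries of Table 19.6 can be regenerated directly from the measurements. The script below is a sketch under stated assumptions: the force and angle values are those read from Table 19.5, the coarser trapezoidal estimates are formed by subsampling the data, and the reference value 129.52 is the one quoted in the text. The variable names are illustrative.

```matlab
% Sketch: work estimates from the tabulated force and angle data.
x = 0:5:30;
F = [0.0 9.0 13.0 14.0 10.5 12.0 5.0];
theta = [0.50 1.40 0.75 0.90 1.30 1.48 1.50];
y = F.*cos(theta);                     % integrand F(x)cos[theta(x)]
Wref = 129.52;                         % reference value quoted in the text

Wtrap1 = trapz(x([1 7]), y([1 7]));    % 1 segment  (h = 30 m)
Wtrap2 = trapz(x(1:3:7), y(1:3:7));    % 2 segments (h = 15 m)
Wtrap3 = trapz(x(1:2:7), y(1:2:7));    % 3 segments (h = 10 m)
Wtrap6 = trapz(x, y);                  % 6 segments (h = 5 m)
Wsimp2 = 30*(y(1) + 4*y(4) + y(7))/6;  % Simpson's 1/3 rule, 2 segments
Wsimp6 = 30*(y(1) + 4*sum(y(2:2:6)) + 2*sum(y(3:2:5)) + y(7))/(3*6);
Wsimp38 = 30*(y(1) + 3*(y(3) + y(5)) + y(7))/8;   % Simpson's 3/8 rule
W = [Wtrap1 Wtrap2 Wtrap3 Wtrap6 Wsimp2 Wsimp6 Wsimp38];
et = abs(Wref - W)/Wref*100;           % percent relative errors
disp([W' et'])
```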

The conclusion to be drawn from Fig. 19.20 is that an adequate number of measurements
must be made to accurately compute integrals. For the present case, if data were
available at F(2.5) cos [θ(2.5)] = 3.9007 and F(12.5) cos [θ(12.5)] = 11.3940, we could
determine an improved integral estimate. For example, using the MATLAB trapz function,
we could compute


>> x = [0 2.5 5 10 12.5 15 20 25 30];
>> y = [0 3.9007 1.5297 9.5120 11.3940 8.7025 2.8087 ...
1.0881 0.3537];
>> trapz(x,y)


ans =
132.6458


FIGURE 19.20
Graphical depiction of why the two-segment trapezoidal rule yields a good estimate of the
integral for this particular case. By chance, the use of two trapezoids happens to lead to an
even balance between positive and negative errors.


Including the two additional points yields an improved integral estimate of 132.6458
(ε_t = 2.41%). Thus, the inclusion of the additional data incorporates the peaks that were
missed previously and, as a consequence, leads to better results.


PROBLEMS 


19.1 Derive Eq. (19.4) by integrating Eq. (19.3). 
19.2 Evaluate the following integral: 


fa —e™)dx 


(a) analytically, (b) single application of the trapezoidal rule,
(c) composite trapezoidal rule with n = 2 and 4, (d) single
application of Simpson’s 1/3 rule, (e) composite Simpson’s
1/3 rule with n = 4, (f) Simpson’s 3/8 rule, and (g) composite
Simpson’s rule, with n = 5. For each of the numerical
estimates (b) through (g), determine the true percent relative
error based on (a).

19.3 Evaluate the following integral: 


$$\int_0^{\pi/2} (8 + 4\cos x)\,dx$$

(a) analytically, (b) single application of the trapezoidal rule,
(c) composite trapezoidal rule with n = 2 and 4, (d) single 
application of Simpson’s 1/3 rule, (e) composite Simpson’s 
1/3 rule with n = 4, (f) Simpson’s 3/8 rule, and (g) com- 
posite Simpson’s rule, with n = 5. For each of the numerical 
estimates (b) through (g), determine the true percent relative 
error based on (a). 


19.4 Evaluate the following integral: 
$$\int_{-2}^{4} (1 - x - 4x^3 + 2x^5)\,dx$$


(a) analytically, (b) single application of the trapezoidal rule, 
(c) composite trapezoidal rule with n = 2 and 4, (d) single ap- 
plication of Simpson’s 1/3 rule, (e) Simpson’s 3/8 rule, and 
(f) Boole’s rule. For each of the numerical estimates (b) through
(f), determine the true percent relative error based on (a). 
19.5 The function 


$$f(x) = e^{-x}$$

can be used to generate the following table of unequally
spaced data:


x       0    0.1      0.3      0.5      0.7      0.95     1.2
f(x)    1    0.9048   0.7408   0.6065   0.4966   0.3867   0.3012


Evaluate the integral from a = 0 to b = 1.2 using (a) analyti- 
cal means, (b) the trapezoidal rule, and (c) a combination of 
the trapezoidal and Simpson’s rules wherever possible to at- 
tain the highest accuracy. For (b) and (c), compute the true 
percent relative error. 

19.6 Evaluate the double integral 


$$\int_{-2}^{2} \int_{0}^{4} (x^2 - 3y^2 + xy^3)\,dx\,dy$$


(a) analytically, (b) using the composite trapezoidal rule 
with n = 2, (c) using single applications of Simpson’s 1/3 
rule, and (d) using the integral2 function. For (b) and (c), 
compute the percent relative error. 




FIGURE P19.9 


19.7 Evaluate the triple integral 


$$\int_{-4}^{4} \int_{0}^{6} \int_{-1}^{3} (x^3 - 2yz)\,dx\,dy\,dz$$


(a) analytically, (b) using single applications of Simpson’s 
1/3 rule, and (c) the integral3 function. For (b), compute the 
true percent relative error. 

19.8 Determine the distance traveled from the following 
velocity data: 


t   1   2   3.25   4.5   6     7   8   8.5   9   10
v   5   6   5.5    7     8.5   8   6   7     7   5


(a) Use the trapezoidal rule. In addition, determine the aver- 
age velocity. 

(b) Fit the data with a cubic equation using polynomial 
regression. Integrate the cubic equation to determine the 
distance. 

19.9 Water exerts pressure on the upstream face of a dam as 

shown in Fig. P19.9. The pressure can be characterized by 


$$p(z) = \rho g (D - z) \tag{P19.9}$$

where p(z) = pressure in pascals (or N/m²) exerted at an
elevation z meters above the reservoir bottom; ρ = density
of water, which for this problem is assumed to be a constant
10³ kg/m³; g = acceleration due to gravity (9.81 m/s²); and
D = elevation (in m) of the water surface above the reservoir
bottom. According to Eq. (P19.9), pressure increases
linearly with depth, as depicted in Fig. P19.9a. Omitting
atmospheric pressure (because it works against both sides of
the dam face and essentially cancels out), the total force $f_t$
can be determined by multiplying pressure times the area of
the dam face (as shown in Fig. P19.9b). Because both pressure
and area vary with elevation, the total force is obtained
by evaluating


$$f_t = \int_0^D \rho g w(z)(D - z)\,dz$$


Water exerting pressure on the upstream face of a dam: (a) side view showing force increasing 
linearly with depth; (b) front view showing width of dam in meters. 
where w(z) = width of the dam face (m) at elevation z
(Fig. P19.9b). The line of action can also be obtained by
evaluating


$$d = \frac{\int_0^D \rho g z w(z)(D - z)\,dz}{\int_0^D \rho g w(z)(D - z)\,dz}$$


Use Simpson’s rule to compute $f_t$ and d.
19.10 The force on a sailboat mast can be represented by the 
following function: 


$$f(z) = 200\left(\frac{z}{5 + z}\right) e^{-2z/H}$$


where z = the elevation above the deck and H = the height of 
the mast. The total force F exerted on the mast can be deter- 
mined by integrating this function over the height of the mast: 


$$F = \int_0^H f(z)\,dz$$

The line of action can also be determined by integration:


$$d = \frac{\int_0^H z f(z)\,dz}{\int_0^H f(z)\,dz}$$

(a) Use the composite trapezoidal rule to compute F and d 
for the case where H = 30 (n = 6). 
(b) Repeat (a), but use the composite Simpson’s 1/3 rule. 


19.11 A wind force distributed against the side of a sky- 
scraper is measured as 


Height l, m         0     30    60    90    120   150   180   210   240
Force, F(l), N/m    0     340   1200  1550  2700  3100  3200  3500  3750


Compute the net force and the line of action due to this dis- 
tributed wind. 

19.12 An 11-m beam is subjected to a load, and the shear 
force follows the equation 


$$V(x) = 5 + 0.25x^2$$


where V is the shear force, and x is length in distance along 
the beam. We know that V = dM/dx, and M is the bending 
moment. Integration yields the relationship 


$$M = M_o + \int_0^x V\,dx$$


If $M_o$ is zero and x = 11, calculate M using (a) analytical
integration, (b) composite trapezoidal rule, and (c) composite
Simpson’s rules. For (b) and (c) use 1-m increments.


19.13 The total mass of a variable density rod is given by 


$$m = \int_0^L \rho(x) A_c(x)\,dx$$


where m = mass, ρ(x) = density, $A_c(x)$ = cross-sectional
area, x = distance along the rod, and L = the total length of
the rod. The following data have been measured for a 20-m
length rod. Determine the mass in grams to the best possible
accuracy.


x, m        0     4     6     8     12    16    20
ρ, g/cm³    4.00  3.95  3.89  3.80  3.60  3.41  3.30
A_c, cm²    100   103   106   110   120   133   150


19.14 A transportation engineering study requires that you 
determine the number of cars that pass through an intersec- 
tion traveling during morning rush hour. You stand at the 
side of the road and count the number of cars that pass every 
4 minutes at several times as tabulated below. Use the best 
numerical method to determine (a) the total number of cars 
that pass between 7:30 and 9:15 and (b) the rate of cars 
going through the intersection per minute. (Hint: Be careful 
with units.) 


Time (hr)               7:30  7:45  8:00  8:15  8:45  9:15
Rate (cars per 4 min)   18    23    14    24    20    9


19.15 Determine the average value for the data in Fig. P19.15. 
Perform the integral needed for the average in the order shown 
by the following equation: 


= a | "Ta a dx 


19.16 Integration provides a means to compute how much
mass enters or leaves a reactor over a specified time period,
as in


$$M = \int_{t_1}^{t_2} Q c\,dt$$


where $t_1$ and $t_2$ = the initial and final times, respectively.
This formula makes intuitive sense if you recall the analogy
between integration and summation. Thus, the integral
represents the summation of the product of flow times
concentration to give the total mass entering or leaving from
$t_1$ to $t_2$. Use numerical integration to evaluate this equation
for the data listed below: 


FIGURE P19.15 


t, min       0   10   20   30   35   40   45   50
Q, m³/min    4
c, mg/m³     10  35   55   52   40   37   32   34


19.17 The cross-sectional area of a channel can be com- 
puted as 


$$A_c = \int_0^B H(y)\,dy$$


where B = the total channel width (m), H = the depth (m),
and y = distance from the bank (m). In a similar fashion, the
average flow Q (m³/s) can be computed as


$$Q = \int_0^B U(y) H(y)\,dy$$


where U = water velocity (m/s). Use these relationships and
a numerical method to determine $A_c$ and Q for the following
data: 


y,m 0 2 4 5 6 9 
H, m 0.5 1.3 1.25 1.8 1 0.25 
U,m/s 0.03 0.06 0.05 0.13 0.11 0.02 


19.18 The average concentration of a substance $\bar{c}$ (g/m³) in
a lake where the area A (m²) varies with depth z (m) can be
computed by integration:

$$\bar{c} = \frac{\int_0^Z c(z) A(z)\,dz}{\int_0^Z A(z)\,dz}$$


where Z = the total depth (m). Determine the average
concentration based on the following data:


z, m         0       4       8       12      16
A, 10⁶ m²    9.8175  5.1051  1.9635  0.3927  0.0000
c, g/m³      10.2    8.5     7.4     5.2     4.1


19.19 As was done in Sec. 19.9, determine the work performed
if a constant force of 1 N applied at an angle θ results
in the following displacements. Use the cumtrapz function to
determine the cumulative work and plot the result versus θ.


19.20 Compute work as described in Sec. 19.9, but use the
following equations for F(x) and θ(x):

$$F(x) = 1.6x - 0.045x^2$$

$$\theta(x) = -0.00055x^3 + 0.0123x^2 + 0.13x$$


The force is in Newtons and the angle is in radians. Perform 
the integration from x = 0 to 30 m. 


19.21 As specified in the following table, a manufactured 
spherical particle has a density that varies as a function of the 
distance from its center (r = 0): 


r, mm        0     0.12  0.24  0.36  0.49
ρ (g/cm³)    6     5.81  5.14  4.29  3.39
r, mm        0.62  0.79  0.86  0.93  1
ρ (g/cm³)    2.7   2.19  2.1   2.04  2

Use numerical integration to estimate the particle’s mass
(in g) and average density (in g/cm³).

19.22 As specified in the following table, the earth’s density 
varies as a function of the distance from its center (r = 0): 


r, km        0     1100  1500  2450  3400  3630
ρ (g/cm³)    13    12.4  12    11.2  9.7   5.7
r, km        4500  5380  6060  6280  6380
ρ (g/cm³)    5.2   4.7   3.6   3.4   3


Use numerical integration to estimate the earth’s mass (in
metric tonnes) and average density (in g/cm³). Develop
vertically stacked subplots of (top) density versus radius, and
(bottom) mass versus radius. Assume that the earth is a per- 
fect sphere. 

19.23 A spherical tank has a circular orifice in its bottom 
through which the liquid flows out (Fig. P19.23). The fol- 
lowing data is collected for the flow rate through the orifice 
as a function of time: 


t, s        0      500    1000   1500   2200   2900
Q, m³/hr    10.55  9.576  9.072  8.640  8.100  7.560
t, s        3600   4300   5200   6500   7000   7500
Q, m³/hr    7.020  6.480  5.688  4.752  3.348  1.404


Write a script with supporting functions (a) to estimate the 
volume of fluid (in liters) drained over the entire measure- 
ment period and (b) to estimate the liquid level in the tank at 
t=0 s. Note that r= 1.5 m. 




FIGURE P19.23 


19.24 Develop an M-file function to implement the com- 
posite Simpson’s 1/3 rule for equispaced data. Have the 
function print error messages and terminate (1) if the 
data is not equispaced or (2) if the input vectors hold- 
ing the data are not of equal length. If there are only 
2 data points, implement the trapezoidal rule. If there are 
an even number of data points n (i.e., an odd number of 
segments, n — 1), use Simpson’s 3/8 rule for the final 3 
segments. 

19.25 During a storm a high wind blows along one side 
of a rectangular skyscraper as depicted in Fig. P19.25. As 
described in Prob. 19.9, use the best lower-order Newton-
Cotes formulas (trapezoidal, Simpson’s 1/3 and 3/8 rules)
to determine (a) the force on the building in Newtons and 
(b) the line of force in meters. 

19.26 The following data is provided for the velocity of an 
object as a function of time: 


t, s      0   4   8   12  16  20  24  28  30
v, m/s    0   18  31  42  50  56  61  65  70


(a) Limiting yourself to the trapezoidal rule and Simpson’s 1/3
and 3/8 rules, make the best estimate of how far the
object travels from t = 0 to 30 s.

(b) Employ the results of (a) to compute the average velocity. 
FIGURE P19.25

19.27 The total mass of a variable density rod is given by


$$m = \int_0^L \rho(x) A_c(x)\,dx$$


where m = mass, ρ(x) = density, $A_c(x)$ = cross-sectional
area, and x = distance along the rod. The following data has
been measured for a 10-m length rod:


x (m)        0     2     3     4     6     8     10
ρ (g/cm³)    4.00  3.95  3.89  3.80  3.60  3.41  3.30
A_c (cm²)    100   103   106   110   120   133   150


Determine the mass in grams to the best possible accuracy 
limiting yourself to trapezoidal rule and Simpson’s 1/3 and 
3/8 rules. 

19.28 A gas is expanded in an engine cylinder, following 
the law 


PV =c 


The initial pressure is 2550 kPa and the final pressure is 
210 kPa. If the volume at the end of expansion is 0.75 m³,
compute the work done by the gas. 

19.29 The pressure p and volume v of a given mass of gas 
are connected by the relation 


$$(p + a/v^2)(v - b) = k$$


where a, b, and k are constants. Express p in terms of v,
and write a script to compute the work done by the gas in
expanding from an initial volume to a final volume. Test
your solution with a = 0.01, b = 0.001, initial pressure
and volume = 100 kPa and 1 m³, respectively, and final
volume = 2 m³.
Numerical Integration of Functions


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to numerical methods for 
integrating given functions. Specific objectives and topics covered are 


• Understanding how Richardson extrapolation provides a means to create a more
  accurate integral estimate by combining two less accurate estimates.
• Understanding how Gauss quadrature provides superior integral estimates by
  picking optimal abscissas at which to evaluate the function.
• Knowing how to use MATLAB’s built-in function integral to integrate functions.


20.1 INTRODUCTION


In Chap. 19, we noted that functions to be integrated numerically will typically be of two 
forms: a table of values or a function. The form of the data has an important influence on 
the approaches that can be used to evaluate the integral. For tabulated information, you are 
limited by the number of points that are given. In contrast, if the function is available, you 
can generate as many values of f(x) as are required to attain acceptable accuracy. 

At face value, the composite Simpson’s 1/3 rule might seem to be a reasonable tool 
for such problems. Although it is certainly adequate for many problems, there are more 
efficient methods that are available. This chapter is devoted to three such techniques, 
which capitalize on the ability to generate function values to develop efficient schemes for 
numerical integration. 

The first technique is based on Richardson extrapolation, which is a method for 
combining two numerical integral estimates to obtain a third, more accurate value. The 
computational algorithm for implementing Richardson extrapolation in a highly efficient 
manner is called Romberg integration. This technique can be used to generate an integral 
estimate within a prespecified error tolerance. 

The second method is called Gauss quadrature. Recall that, in Chap. 19, values of
f(x) for the Newton-Cotes formulas were determined at specified values of x. For example,
if we used the trapezoidal rule to determine an integral, we were constrained to take the
weighted average of f(x) at the ends of the interval. Gauss-quadrature formulas employ x
values that are positioned between the integration limits in such a manner that a much more
accurate integral estimate results.

The third approach is called adaptive quadrature. This technique applies the composite
Simpson’s 1/3 rule to subintervals of the integration range in a way that allows error
estimates to be computed. These error estimates are then used to determine whether more
refined estimates are required for a subinterval. In this way, more refined segmentation
is only used where it is necessary. A built-in MATLAB function that uses adaptive
quadrature is illustrated.


20.2 ROMBERG INTEGRATION


Romberg integration is one technique that is designed to attain efficient numerical integrals 
of functions. It is quite similar to the techniques discussed in Chap. 19 in the sense that it 
is based on successive application of the trapezoidal rule. However, through mathematical 
manipulations, superior results are attained for less effort. 


20.2.1 Richardson Extrapolation 


Techniques are available to improve the results of numerical integration on the basis of the 
integral estimates themselves. Generally called Richardson extrapolation, these methods 
use two estimates of an integral to compute a third, more accurate approximation. 

The estimate and the error associated with the composite trapezoidal rule can be
represented generally as


$$I = I(h) + E(h)$$


where I = the exact value of the integral, I(h) = the approximation from an n-segment
application of the trapezoidal rule with step size h = (b − a)/n, and E(h) = the truncation
error. If we make two separate estimates using step sizes of $h_1$ and $h_2$ and have exact values
for the error:


$$I(h_1) + E(h_1) = I(h_2) + E(h_2) \tag{20.1}$$


Now recall that the error of the composite trapezoidal rule can be represented
approximately by Eq. (19.21) [with n = (b − a)/h]:

$$E \cong -\frac{b - a}{12}\,h^2 \bar{f}'' \tag{20.2}$$

If it is assumed that $\bar{f}''$ is constant regardless of step size, Eq. (20.2) can be used to
determine that the ratio of the two errors will be

$$\frac{E(h_1)}{E(h_2)} \cong \frac{h_1^2}{h_2^2} \tag{20.3}$$

This calculation has the important effect of removing the term $\bar{f}''$ from the computation.
In so doing, we have made it possible to utilize the information embodied by Eq. (20.2)
without prior knowledge of the function’s second derivative. To do this, we rearrange
Eq. (20.3) to give

$$E(h_1) \cong E(h_2)\left(\frac{h_1}{h_2}\right)^2$$

which can be substituted into Eq. (20.1):

$$I(h_1) + E(h_2)\left(\frac{h_1}{h_2}\right)^2 = I(h_2) + E(h_2)$$

which can be solved for

$$E(h_2) = \frac{I(h_1) - I(h_2)}{1 - (h_1/h_2)^2}$$

Thus, we have developed an estimate of the truncation error in terms of the integral
estimates and their step sizes. This estimate can then be substituted into

$$I = I(h_2) + E(h_2)$$

to yield an improved estimate of the integral:

$$I = I(h_2) + \frac{1}{(h_1/h_2)^2 - 1}\left[I(h_2) - I(h_1)\right] \tag{20.4}$$

It can be shown (Ralston and Rabinowitz, 1978) that the error of this estimate is
$O(h^4)$. Thus, we have combined two trapezoidal rule estimates of $O(h^2)$ to yield a new
estimate of $O(h^4)$. For the special case where the interval is halved ($h_2 = h_1/2$), this
equation becomes

$$I = \frac{4}{3} I(h_2) - \frac{1}{3} I(h_1) \tag{20.5}$$


EXAMPLE 20.1 Richardson Extrapolation

Problem Statement. Use Richardson extrapolation to evaluate the integral of f(x) =
0.2 + 25x − 200x² + 675x³ − 900x⁴ + 400x⁵ from a = 0 to b = 0.8.


Solution. Single and composite applications of the trapezoidal rule can be used to evaluate 
the integral: 


Segments   h     Integral   $\varepsilon_t$
1          0.8   0.1728     89.5%
2          0.4   1.0688     34.9%
4          0.2   1.4848     9.5%


Richardson extrapolation can be used to combine these results to obtain improved estimates 
of the integral. For example, the estimates for one and two segments can be combined 
to yield

$$I = \frac{4}{3}(1.0688) - \frac{1}{3}(0.1728) = 1.367467$$

The error of the improved integral is $E_t$ = 1.640533 − 1.367467 = 0.273067 ($\varepsilon_t$ = 16.6%),
which is superior to the estimates upon which it was based.
In the same manner, the estimates for two and four segments can be combined to give

$$I = \frac{4}{3}(1.4848) - \frac{1}{3}(1.0688) = 1.623467$$

which represents an error of $E_t$ = 1.640533 − 1.623467 = 0.017067 ($\varepsilon_t$ = 1.0%).
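
The arithmetic of this example can be reproduced with a few lines of MATLAB. What follows is a minimal sketch rather than a general-purpose routine: the helper `trapn` is a name introduced here for illustration only (it is not a function from the text), and it builds the composite trapezoidal estimates with the built-in `trapz` before Eq. (20.5) is applied.

```matlab
% Richardson extrapolation, Eq. (20.5), applied to the polynomial of Example 20.1
f = @(x) 0.2 + 25*x - 200*x.^2 + 675*x.^3 - 900*x.^4 + 400*x.^5;
a = 0; b = 0.8;

% trapn is a hypothetical helper: composite trapezoidal rule with n equal segments
trapn = @(n) trapz(linspace(a, b, n+1), f(linspace(a, b, n+1)));

I1 = trapn(1);   % h = 0.8
I2 = trapn(2);   % h = 0.4
I4 = trapn(4);   % h = 0.2

% Combine pairs of estimates whose step sizes differ by a factor of 2
Ia = 4/3*I2 - 1/3*I1;   % from the one- and two-segment estimates
Ib = 4/3*I4 - 1/3*I2;   % from the two- and four-segment estimates

fprintf('%.6f  %.6f\n', Ia, Ib)
```

The two printed values should agree with the improved estimates 1.367467 and 1.623467 obtained by hand above.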


Equation (20.4) provides a way to combine two applications of the trapezoidal rule
with error $O(h^2)$ to compute a third estimate with error $O(h^4)$. This approach is a subset of
a more general method for combining integrals to obtain improved estimates. For instance,
in Example 20.1, we computed two improved integrals of $O(h^4)$ on the basis of three
trapezoidal rule estimates. These two improved integrals can, in turn, be combined to yield an
even better value with $O(h^6)$. For the special case where the original trapezoidal estimates
are based on successive halving of the step size, the equation used for $O(h^6)$ accuracy is


$$I = \frac{16}{15} I_m - \frac{1}{15} I_l \tag{20.6}$$


where $I_m$ and $I_l$ are the more and less accurate estimates, respectively. Similarly, two $O(h^6)$
results can be combined to compute an integral that is $O(h^8)$ using


$$I = \frac{64}{63} I_m - \frac{1}{63} I_l \tag{20.7}$$


EXAMPLE 20.2 Higher-Order Corrections

Problem Statement. In Example 20.1, we used Richardson extrapolation to compute two
integral estimates of $O(h^4)$. Utilize Eq. (20.6) to combine these estimates to compute an
integral with $O(h^6)$.


Solution. The two integral estimates of $O(h^4)$ obtained in Example 20.1 were 1.367467
and 1.623467. These values can be substituted into Eq. (20.6) to yield


$$I = \frac{16}{15}(1.623467) - \frac{1}{15}(1.367467) = 1.640533$$


which is the exact value of the integral. 


20.2.2 The Romberg Integration Algorithm 


Notice that the coefficients in each of the extrapolation equations [Eqs. (20.5), (20.6), and 
(20.7)] add up to 1. Thus, they represent weighting factors that, as accuracy increases, 
place relatively greater weight on the superior integral estimate. These formulations can be
expressed in a general form that is well suited for computer implementation:


$$I_{j,k} = \frac{4^{k-1} I_{j+1,k-1} - I_{j,k-1}}{4^{k-1} - 1} \tag{20.8}$$

where $I_{j+1,k-1}$ and $I_{j,k-1}$ = the more and less accurate integrals, respectively, and $I_{j,k}$ =
the improved integral. The index k signifies the level of the integration, where k = 1
corresponds to the original trapezoidal rule estimates, k = 2 corresponds to the $O(h^4)$
estimates, k = 3 to the $O(h^6)$, and so forth. The index j is used to distinguish between
the more (j + 1) and the less (j) accurate estimates. For example, for k = 2 and j = 1,
Eq. (20.8) becomes


$$I_{1,2} = \frac{4 I_{2,1} - I_{1,1}}{3}$$


which is equivalent to Eq. (20.5).

The general form represented by Eq. (20.8) is attributed to Romberg, and its
systematic application to evaluate integrals is known as Romberg integration. Figure 20.1 is a
graphical depiction of the sequence of integral estimates generated using this approach.
Each matrix corresponds to a single iteration. The first column contains the trapezoidal
rule evaluations that are designated $I_{j,1}$, where j = 1 is for a single-segment application
(step size is b − a), j = 2 is for a two-segment application [step size is (b − a)/2], j = 3 is
for a four-segment application [step size is (b − a)/4], and so forth. The other columns of
the matrix are generated by systematically applying Eq. (20.8) to obtain successively better
estimates of the integral.

For example, the first iteration (Fig. 20.1a) involves computing the one- and two-
segment trapezoidal rule estimates ($I_{1,1}$ and $I_{2,1}$). Equation (20.8) is then used to compute
the element $I_{1,2}$ = 1.367467, which has an error of $O(h^4)$.


FIGURE 20.1
Graphical depiction of the sequence of integral estimates generated using Romberg
integration. (a) First iteration. (b) Second iteration. (c) Third iteration.


        O(h^2)      O(h^4)      O(h^6)      O(h^8)

(a)     0.172800    1.367467
        1.068800

(b)     0.172800    1.367467    1.640533
        1.068800    1.623467
        1.484800

(c)     0.172800    1.367467    1.640533    1.640533
        1.068800    1.623467    1.640533
        1.484800    1.639467
        1.600800

Now, we must check to determine whether this result is adequate for our needs. As in
other approximate methods in this book, a termination, or stopping, criterion is required
to assess the accuracy of the results. One method that can be employed for the present
purposes is


$$|\varepsilon_a| = \left|\frac{I_{1,k} - I_{2,k-1}}{I_{1,k}}\right| \times 100\% \tag{20.9}$$


where $\varepsilon_a$ = an estimate of the percent relative error. Thus, as was done previously in other
iterative processes, we compare the new estimate with a previous value. For Eq. (20.9), the
previous value is the most accurate estimate from the previous level of integration (i.e., the
k − 1 level of integration with j = 2). When the change between the old and new values as
represented by $\varepsilon_a$ is below a prespecified error criterion $\varepsilon_s$, the computation is terminated.
For Fig. 20.1a, this evaluation indicates the following percent change over the course of
the first iteration:


$$|\varepsilon_a| = \left|\frac{1.367467 - 1.068800}{1.367467}\right| \times 100\% = 21.8\%$$


The object of the second iteration (Fig. 20.1b) is to obtain the $O(h^6)$ estimate, $I_{1,3}$.
To do this, a four-segment trapezoidal rule estimate, $I_{3,1}$ = 1.4848, is determined. Then it
is combined with $I_{2,1}$ using Eq. (20.8) to generate $I_{2,2}$ = 1.623467. The result is, in turn,
combined with $I_{1,2}$ to yield $I_{1,3}$ = 1.640533. Equation (20.9) can be applied to determine
that this result represents a change of 1.0% when compared with the previous result $I_{2,2}$.

The third iteration (Fig. 20.1c) continues the process in the same fashion. In this case, 
an eight-segment trapezoidal estimate is added to the first column, and then Eq. (20.8) 
is applied to compute successively more accurate integrals along the lower diagonal. 
After only three iterations, because we are evaluating a fifth-order polynomial, the result 
($I_{1,4}$ = 1.640533) is exact.
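
As a check on the numbers quoted for the three iterations, the following sketch fills in the complete tableau of Fig. 20.1c from its first column by repeated use of Eq. (20.8). It is an illustration only: the four trapezoidal values are typed in directly from the figure instead of being computed, and the table is filled column by column, which yields the same entries as an iteration-by-iteration sweep.

```matlab
% Romberg tableau for Fig. 20.1c, built with Eq. (20.8)
m = 4;                       % number of trapezoidal estimates
I = zeros(m);
I(:,1) = [0.172800; 1.068800; 1.484800; 1.600800];  % 1, 2, 4, 8 segments

for k = 2:m                  % level of integration (column)
  for j = 1:m-k+1            % row within that level
    I(j,k) = (4^(k-1)*I(j+1,k-1) - I(j,k-1)) / (4^(k-1) - 1);
  end
end

% Stopping criterion, Eq. (20.9), evaluated for the third iteration
ea = abs((I(1,m) - I(2,m-1)) / I(1,m)) * 100;

disp(I)                      % unused lower-right entries remain zero
fprintf('ea = %g%%\n', ea)
```

The first row of the displayed matrix holds the successive best estimates $I_{1,1}$ through $I_{1,4}$.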

Romberg integration is more efficient than the trapezoidal rule and Simpson’s rules. 
For example, for determination of the integral as shown in Fig. 20.1, Simpson’s 1/3 rule 
would require about a 48-segment application in double precision to yield an estimate of 
the integral to seven significant digits: 1.640533. In contrast, Romberg integration pro- 
duces the same result based on combining one-, two-, four-, and eight-segment trapezoidal 
rules—that is, with only 15 function evaluations! 

Figure 20.2 presents an M-file for Romberg integration. By using loops, this algorithm 
implements the method in an efficient manner. Note that the function uses another function 
trap to implement the composite trapezoidal rule evaluations (recall Fig. 19.10). Here is a 
MATLAB session showing how it can be used to determine the integral of the polynomial 
from Example 20.1: 


>> f=@(x) 0.2+25*x-200*x^2+675*x^3-900*x^4+400*x^5;
>> romberg(f,0,0.8)


ans =
    1.6405

function [q,ea,iter]=romberg(func,a,b,es,maxit,varargin)
% romberg: Romberg integration quadrature
%   q = romberg(func,a,b,es,maxit,p1,p2,...):
%                    Romberg integration.
% input:
%   func = name of function to be integrated
%   a, b = integration limits
%   es = desired relative error (default = 0.000001%)
%   maxit = maximum allowable iterations (default = 50)
%   p1,p2,... = additional parameters used by func
% output:
%   q = integral estimate
%   ea = approximate relative error (%)
%   iter = number of iterations

if nargin<3,error('at least 3 input arguments required'),end
if nargin<4||isempty(es), es=0.000001;end
if nargin<5||isempty(maxit), maxit=50;end
n = 1;
I(1,1) = trap(func,a,b,n,varargin{:});    % single-segment estimate
iter = 0;
while iter<maxit
  iter = iter+1;
  n = 2^iter;                             % double the number of segments
  I(iter+1,1) = trap(func,a,b,n,varargin{:});
  for k = 2:iter+1                        % extrapolate along the new diagonal
    j = 2+iter-k;
    I(j,k) = (4^(k-1)*I(j+1,k-1)-I(j,k-1))/(4^(k-1)-1);
  end
  ea = abs((I(1,iter+1)-I(2,iter))/I(1,iter+1))*100;
  if ea<=es, break; end
end
q = I(1,iter+1);


FIGURE 20.2
M-file to implement Romberg integration.


20.3 GAUSS QUADRATURE


In Chap. 19, we employed the Newton-Cotes equations. A characteristic of these formulas 
(with the exception of the special case of unequally spaced data) was that the integral esti- 
mate was based on evenly spaced function values. Consequently, the location of the base 
points used in these equations was predetermined or fixed. 

For example, as depicted in Fig. 20.3a, the trapezoidal rule is based on taking the area 
under the straight line connecting the function values at the ends of the integration interval. 
The formula that is used to compute this area is 


$$I \cong (b - a)\,\frac{f(a) + f(b)}{2} \tag{20.10}$$


FIGURE 20.3
(a) Graphical depiction of the trapezoidal rule as the area under the straight line joining fixed
end points. (b) An improved integral estimate obtained by taking the area under the straight
line passing through two intermediate points. By positioning these points wisely, the positive
and negative errors are better balanced, and an improved integral estimate results.


where a and b = the limits of integration and b — a = the width of the integration interval. 
Because the trapezoidal rule must pass through the end points, there are cases such as 
Fig. 20.3a where the formula results in a large error. 

Now, suppose that the constraint of fixed base points was removed and we were free to 
evaluate the area under a straight line joining any two points on the curve. By positioning 
these points wisely, we could define a straight line that would balance the positive and nega- 
tive errors. Hence, as in Fig. 20.3b, we would arrive at an improved estimate of the integral. 

Gauss quadrature is the name for a class of techniques to implement such a strategy. The 
particular Gauss quadrature formulas described in this section are called Gauss-Legendre 
formulas. Before describing the approach, we will show how numerical integration formulas 
such as the trapezoidal rule can be derived using the method of undetermined coefficients. 
This method will then be employed to develop the Gauss-Legendre formulas. 


20.3.1 Method of Undetermined Coefficients 


In Chap. 19, we derived the trapezoidal rule by integrating a linear interpolating polynomial 
and by geometrical reasoning. The method of undetermined coefficients offers a third ap- 
proach that also has utility in deriving other integration techniques such as Gauss quadrature. 


FIGURE 20.4 
Two integrals that should be evaluated exactly by the trapezoidal rule: (a) a constant and 
(b) a straight line. 


To illustrate the approach, Eq. (20.10) is expressed as 


$$I \cong c_0 f(a) + c_1 f(b) \tag{20.11}$$


where the c’s = constants. Now realize that the trapezoidal rule should yield exact results 
when the function being integrated is a constant or a straight line. Two simple equations 
that represent these cases are y = 1 and y = x (Fig. 20.4). Thus, the following equalities 
should hold: 


$$c_0 + c_1 = \int_{-(b-a)/2}^{(b-a)/2} 1\,dx$$

and

$$-c_0\,\frac{b - a}{2} + c_1\,\frac{b - a}{2} = \int_{-(b-a)/2}^{(b-a)/2} x\,dx$$

or, evaluating the integrals,

$$c_0 + c_1 = b - a$$

and

$$-c_0\,\frac{b - a}{2} + c_1\,\frac{b - a}{2} = 0$$

These are two equations with two unknowns that can be solved for

$$c_0 = c_1 = \frac{b - a}{2}$$

which, when substituted back into Eq. (20.11), gives

$$I = \frac{b - a}{2}\,f(a) + \frac{b - a}{2}\,f(b)$$

which is equivalent to the trapezoidal rule.
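
The two conditions above form a small linear system, so the coefficients can also be obtained numerically. The sketch below is one way to do this with MATLAB's backslash operator; the limits a = 0 and b = 0.8 are chosen here purely for illustration and are not part of the derivation.

```matlab
% Method of undetermined coefficients for the trapezoidal rule
a = 0; b = 0.8;              % illustrative limits (assumed); any a < b works
h = (b - a)/2;

% Rows: exactness for y = 1 and for y = x on [-(b-a)/2, (b-a)/2]
A   = [ 1  1
       -h  h];
rhs = [b - a; 0];

c = A \ rhs;                 % c(1) = c0, c(2) = c1
disp(c')
```

Both entries of `c` come out equal to (b − a)/2, in agreement with the result derived by hand.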


20.3.2 Derivation of the Two-Point Gauss-Legendre Formula 


Just as was the case for the previous derivation of the trapezoidal rule, the object of Gauss 
quadrature is to determine the coefficients of an equation of the form 


$$I \cong c_0 f(x_0) + c_1 f(x_1) \tag{20.12}$$


where the c’s = the unknown coefficients. However, in contrast to the trapezoidal rule that
used fixed end points a and b, the function arguments $x_0$ and $x_1$ are not fixed at the end
points, but are unknowns (Fig. 20.5). Thus, we now have a total of four unknowns that must
be evaluated, and consequently, we require four conditions to determine them exactly.


FIGURE 20.5
Graphical depiction of the unknown variables $x_0$ and $x_1$ for integration by Gauss quadrature.


Just as for the trapezoidal rule, we can obtain two of these conditions by assuming
that Eq. (20.12) fits the integral of a constant and a linear function exactly. Then, to arrive
at the other two conditions, we merely extend this reasoning by assuming that it also fits the
integral of a parabolic (y = x²) and a cubic (y = x³) function. By doing this, we determine
all four unknowns and in the bargain derive a linear two-point integration formula that is
exact for cubics. The four equations to be solved are


$$c_0 + c_1 = \int_{-1}^{1} 1\,dx = 2 \tag{20.13}$$

$$c_0 x_0 + c_1 x_1 = \int_{-1}^{1} x\,dx = 0 \tag{20.14}$$

$$c_0 x_0^2 + c_1 x_1^2 = \int_{-1}^{1} x^2\,dx = \frac{2}{3} \tag{20.15}$$

$$c_0 x_0^3 + c_1 x_1^3 = \int_{-1}^{1} x^3\,dx = 0 \tag{20.16}$$


Equations (20.13) through (20.16) can be solved simultaneously for the four unknowns.
First, solve Eq. (20.14) for $c_1$ and substitute the result into Eq. (20.16), which can be solved for

$$x_0^2 = x_1^2$$

Since $x_0$ and $x_1$ cannot be equal, this means that $x_0 = -x_1$. Substituting this result into
Eq. (20.14) yields $c_0 = c_1$. Consequently from Eq. (20.13) it follows that


$$c_0 = c_1 = 1$$


Substituting these results into Eq. (20.15) gives


$$x_0 = -\frac{1}{\sqrt{3}} = -0.5773503\ldots$$

$$x_1 = \frac{1}{\sqrt{3}} = 0.5773503\ldots$$


Therefore, the two-point Gauss-Legendre formula is


$$I = f\left(\frac{-1}{\sqrt{3}}\right) + f\left(\frac{1}{\sqrt{3}}\right) \tag{20.17}$$


Thus, we arrive at the interesting result that the simple addition of the function values at
$x = -1/\sqrt{3}$ and $1/\sqrt{3}$ yields an integral estimate that is third-order accurate.
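
One way to convince yourself of this result is to test the formula on the monomials used in its derivation. The short sketch below does so for powers 0 through 4; the loop and variable names are introduced here for illustration and do not come from the text.

```matlab
% Check the two-point Gauss-Legendre formula against the monomials x^p
x = [-1 1]/sqrt(3);          % function arguments x0 and x1
c = [1 1];                   % weighting factors c0 and c1

for p = 0:4
  approx = sum(c .* x.^p);               % c0*x0^p + c1*x1^p
  exact  = (1 - (-1)^(p+1)) / (p+1);     % integral of x^p from -1 to 1
  fprintf('p = %d: formula = %8.5f, exact = %8.5f\n', p, approx, exact)
end
```

For p = 0 through 3 the two columns agree, which is the content of Eqs. (20.13) through (20.16). For p = 4 the formula gives 2/9 whereas the integral is 2/5, so the formula is exact for cubics but not for quartics.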

Notice that the integration limits in Eqs. (20.13) through (20.16) are from −1 to 1. This
was done to simplify the mathematics and to make the formulation as general as possible.
A simple change of variable can be used to translate other limits of integration into this
form. This is accomplished by assuming that a new variable $x_d$ is related to the original
variable x in a linear fashion, as in


$$x = a_1 + a_2 x_d \tag{20.18}$$

If the lower limit, x = a, corresponds to $x_d$ = −1, these values can be substituted into
Eq. (20.18) to yield

$$a = a_1 + a_2(-1) \tag{20.19}$$


Similarly, the upper limit, x = b, corresponds to $x_d$ = 1, to give


$$b = a_1 + a_2(1) \tag{20.20}$$

Equations (20.19) and (20.20) can be solved simultaneously for


$$a_1 = \frac{b + a}{2} \quad \text{and} \quad a_2 = \frac{b - a}{2} \tag{20.21}$$

which can be substituted into Eq. (20.18) to yield

$$x = \frac{(b + a) + (b - a)x_d}{2} \tag{20.22}$$

This equation can be differentiated to give

$$dx = \frac{b - a}{2}\,dx_d \tag{20.23}$$


Equations (20.22) and (20.23) can be substituted for x and dx, respectively, in the equation
to be integrated. These substitutions effectively transform the integration interval without
changing the value of the integral. The following example illustrates how this is done in
practice.


EXAMPLE 20.3 Two-Point Gauss-Legendre Formula

Problem Statement. Use Eq. (20.17) to evaluate the integral of

$$f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5$$


between the limits x = 0 to 0.8. The exact value of the integral is 1.640533. 


Solution. Before integrating the function, we must perform a change of variable so that
the limits are from -1 to +1. To do this, we substitute a = 0 and b = 0.8 into Eqs. (20.22)
and (20.23) to yield

$$x = 0.4 + 0.4x_d \quad \text{and} \quad dx = 0.4\,dx_d$$

Both of these can be substituted into the original equation to yield

$$\int_0^{0.8} (0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5)\,dx$$
$$= \int_{-1}^{1} [0.2 + 25(0.4 + 0.4x_d) - 200(0.4 + 0.4x_d)^2 + 675(0.4 + 0.4x_d)^3 - 900(0.4 + 0.4x_d)^4 + 400(0.4 + 0.4x_d)^5]\,0.4\,dx_d$$


Therefore, the right-hand side is in the form that is suitable for evaluation using Gauss
quadrature. The transformed function can be evaluated at $x_d = -1/\sqrt{3}$ as 0.516741 and
at $x_d = 1/\sqrt{3}$ as 1.305837. Therefore, the integral according to Eq. (20.17) is 0.516741 +
1.305837 = 1.822578, which represents a percent relative error of -11.1%. This result is
comparable in magnitude to a four-segment application of the trapezoidal rule or a single
application of Simpson’s 1/3 and 3/8 rules. This latter result is to be expected because
Simpson’s rules are also third-order accurate. However, because of the clever choice
of base points, Gauss quadrature attains this accuracy on the basis of only two function
evaluations.
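
The computation in this example can be reproduced in a few lines of MATLAB. This is only a sketch of the steps described above; the variable names (xd, x, I, et) are illustrative choices rather than anything defined in the text.

```matlab
% Two-point Gauss-Legendre estimate for Example 20.3.
f = @(x) 0.2 + 25*x - 200*x.^2 + 675*x.^3 - 900*x.^4 + 400*x.^5;
a = 0; b = 0.8;
xd = [-1 1]/sqrt(3);                 % Gauss points on [-1, 1]
x = ((b + a) + (b - a)*xd)/2;        % map to [a, b] with Eq. (20.22)
I = (b - a)/2*sum(f(x));             % Eq. (20.17), with dx = (b - a)/2 dxd
et = (1.640533 - I)/1.640533*100;    % percent relative error
fprintf('I = %.6f   et = %.1f%%\n', I, et)
```

Folding the factor (b - a)/2 from Eq. (20.23) into the final sum avoids having to expand the transformed polynomial by hand.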
TABLE 20.1 Weighting factors and function arguments used in Gauss-Legendre formulas.

Points  Weighting Factors               Function Arguments                        Truncation Error
1       $c_0 = 2$                       $x_0 = 0.0$                               $\cong f^{(2)}(\xi)$
2       $c_0 = 1$                       $x_0 = -1/\sqrt{3}$                       $\cong f^{(4)}(\xi)$
        $c_1 = 1$                       $x_1 = 1/\sqrt{3}$
3       $c_0 = 5/9$                     $x_0 = -\sqrt{3/5}$                       $\cong f^{(6)}(\xi)$
        $c_1 = 8/9$                     $x_1 = 0.0$
        $c_2 = 5/9$                     $x_2 = \sqrt{3/5}$
4       $c_0 = (18 - \sqrt{30})/36$     $x_0 = -\sqrt{525 + 70\sqrt{30}}/35$      $\cong f^{(8)}(\xi)$
        $c_1 = (18 + \sqrt{30})/36$     $x_1 = -\sqrt{525 - 70\sqrt{30}}/35$
        $c_2 = (18 + \sqrt{30})/36$     $x_2 = \sqrt{525 - 70\sqrt{30}}/35$
        $c_3 = (18 - \sqrt{30})/36$     $x_3 = \sqrt{525 + 70\sqrt{30}}/35$
5       $c_0 = (322 - 13\sqrt{70})/900$ $x_0 = -\sqrt{245 + 14\sqrt{70}}/21$      $\cong f^{(10)}(\xi)$
        $c_1 = (322 + 13\sqrt{70})/900$ $x_1 = -\sqrt{245 - 14\sqrt{70}}/21$
        $c_2 = 128/225$                 $x_2 = 0.0$
        $c_3 = (322 + 13\sqrt{70})/900$ $x_3 = \sqrt{245 - 14\sqrt{70}}/21$
        $c_4 = (322 - 13\sqrt{70})/900$ $x_4 = \sqrt{245 + 14\sqrt{70}}/21$
6       $c_0 = 0.171324492379170$       $x_0 = -0.932469514203152$                $\cong f^{(12)}(\xi)$
        $c_1 = 0.360761573048139$       $x_1 = -0.661209386466265$
        $c_2 = 0.467913934572691$       $x_2 = -0.238619186083197$
        $c_3 = 0.467913934572691$       $x_3 = 0.238619186083197$
        $c_4 = 0.360761573048139$       $x_4 = 0.661209386466265$
        $c_5 = 0.171324492379170$       $x_5 = 0.932469514203152$


20.3.3 Higher-Point Formulas


Beyond the two-point formula described in the previous section, higher-point versions can
be developed in the general form

$$I \cong c_0 f(x_0) + c_1 f(x_1) + \cdots + c_{n-1} f(x_{n-1}) \tag{20.24}$$

where n = the number of points. Values for c’s and x’s for up to and including the six-point
formula are summarized in Table 20.1.


EXAMPLE 20.4 Three-Point Gauss-Legendre Formula


Problem Statement. Use the three-point formula from Table 20.1 to estimate the integral 
for the same function as in Example 20.3. 


Solution. According to Table 20.1, the three-point formula is 

I = 0.5555556 f(—0.7745967) + 0.8888889 f(0) + 0.5555556 f(0.7745967) 
which is equal to 

I = 0.2813013 + 0.8732444 + 0.4859876 = 1.640533 


which is exact. 
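
The three-point result can be checked the same way. The sketch below assumes the weights and arguments from Table 20.1 and uses illustrative variable names (c, xd, I) that are not part of the text.

```matlab
% Three-point Gauss-Legendre estimate for Example 20.4.
f = @(x) 0.2 + 25*x - 200*x.^2 + 675*x.^3 - 900*x.^4 + 400*x.^5;
a = 0; b = 0.8;
c  = [5 8 5]/9;                      % weighting factors (Table 20.1)
xd = [-sqrt(3/5) 0 sqrt(3/5)];       % function arguments on [-1, 1]
x  = ((b + a) + (b - a)*xd)/2;       % change of variable, Eq. (20.22)
I  = (b - a)/2*sum(c.*f(x));         % Eq. (20.24) with n = 3
fprintf('I = %.6f\n', I)
```

Because the three-point formula integrates polynomials up to fifth order exactly, the estimate matches the true value of this fifth-order polynomial to within roundoff.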
Because Gauss quadrature requires function evaluations at nonuniformly spaced points
within the integration interval, it is not appropriate for cases where the function is unknown.
Thus, it is not suited for engineering problems that deal with tabulated data. However, where
the function is known, its efficiency can be a decided advantage. This is particularly true
when numerous integral evaluations must be performed.


20.4 ADAPTIVE QUADRATURE


Although Romberg integration is more efficient than the composite Simpson’s 1/3 rule, 
both use equally spaced points. This constraint does not take into account that some func- 
tions have regions of relatively abrupt changes where more refined spacing might be 
required. Hence, to achieve a desired accuracy, fine spacing must be applied everywhere 
even though it is only needed for the regions of sharp change. Adaptive quadrature meth- 
ods remedy this situation by automatically adjusting the step size so that small steps are 
taken in regions of sharp variations and larger steps are taken where the function changes 
gradually. 


20.4.1 MATLAB M-file: quadadapt 


Adaptive quadrature methods accommodate the fact that many functions have regions of 
high variability along with other sections where change is gradual. They accomplish this 
by adjusting the step size so that small intervals are used in regions of rapid variations and 
larger intervals are used where the function changes gradually. Many of these techniques 
are based on applying the composite Simpson’s 1/3 rule to subintervals in a fashion that 
is very similar to the way in which the composite trapezoidal rule was used in Richardson 
extrapolation. That is, the 1/3 rule is applied at two levels of refinement, and the difference 
between these two levels is used to estimate the truncation error. If the truncation error is 
acceptable, no further refinement is required, and the integral estimate for the subinterval is 
deemed acceptable. If the error estimate is too large, the step size is refined and the process 
repeated until the error falls to acceptable levels. The total integral is then computed as the 
summation of the integral estimates for the subintervals. 

The theoretical basis of the approach can be illustrated for an interval x = a to x = b
with a width of $h_1 = b - a$. A first estimate of the integral can be obtained with Simpson’s
1/3 rule:

$$I(h_1) = \frac{h_1}{6}\,[f(a) + 4f(c) + f(b)] \tag{20.25}$$

where c = (a + b)/2.
As in Richardson extrapolation, a more refined estimate can be obtained by halving the
step size. That is, by applying the composite Simpson’s 1/3 rule with n = 4:

$$I(h_2) = \frac{h_2}{6}\,[f(a) + 4f(d) + 2f(c) + 4f(e) + f(b)] \tag{20.26}$$

where d = (a + c)/2, e = (c + b)/2, and $h_2 = h_1/2$.
Because both $I(h_1)$ and $I(h_2)$ are estimates of the same integral, their difference
provides a measure of the error. That is,

$$E \cong I(h_2) - I(h_1) \tag{20.27}$$
In addition, the estimate and error associated with either application can be represented
generally as

$$I = I(h) + E(h) \tag{20.28}$$

where $I$ = the exact value of the integral, $I(h)$ = the approximation from an n-segment
application of the Simpson’s 1/3 rule with step size h = (b - a)/n, and $E(h)$ = the
corresponding truncation error.

Using an approach similar to Richardson extrapolation, we can derive an estimate of
the error of the more refined estimate $I(h_2)$ as a function of the difference between the two
integral estimates:

$$E(h_2) \cong \frac{1}{15}\,[I(h_2) - I(h_1)] \tag{20.29}$$

The error can then be added to $I(h_2)$ to generate an even better estimate:

$$I = I(h_2) + \frac{1}{15}\,[I(h_2) - I(h_1)] \tag{20.30}$$


This result is equivalent to Boole’s rule (Table 19.2). 
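
The factor of 1/15 in Eq. (20.29) follows from the fact that the truncation error of Simpson’s 1/3 rule is proportional to $h^4$. A short sketch of the argument, assuming that the fourth derivative is roughly constant over the interval so that halving the step size reduces the error by a factor of 16:

```latex
% Both estimates approximate the same integral, Eq. (20.28):
I = I(h_1) + E(h_1) = I(h_2) + E(h_2)
% Error proportional to h^4 with h_2 = h_1/2 (assumes a nearly constant f^{(4)}):
E(h_1) \cong 16\,E(h_2)
% Substitute and collect the error terms:
I(h_1) + 16\,E(h_2) \cong I(h_2) + E(h_2)
15\,E(h_2) \cong I(h_2) - I(h_1)
E(h_2) \cong \frac{1}{15}\,[I(h_2) - I(h_1)]
```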

The equations just developed can now be combined into an efficient algorithm. 
Figure 20.6 presents an M-file function that is based on an algorithm originally developed 
by Cleve Moler (2004). 

The function consists of a main calling function quadadapt along with a recursive
function quadstep that actually performs the integration. The main calling function quadadapt
is passed the function f and the integration limits a and b. After setting the tolerance, the
function evaluations required for the initial application of Simpson’s 1/3 rule [Eq. (20.25)]
are computed. These values along with the integration limits are then passed to quadstep.
Within quadstep, the remaining step sizes and function values are determined, and the two
integral estimates [Eqs. (20.25) and (20.26)] are computed.

At this point, the error is estimated as the absolute difference between the integral
estimates. Depending on the value of the error, two things can then happen:


1. If the error is less than or equal to the tolerance (tol), Boole’s rule is generated; the
function terminates and passes back the result.

2. If the error is larger than the tolerance, quadstep is invoked twice to evaluate each of the
two subintervals of the current call.


The two recursive calls in the second step represent the real beauty of this algorithm. 
They just keep subdividing until the tolerance is met. Once this occurs, their results are 
passed back up the recursive path, combining with the other integral estimates along the 
way. The process ends when the final call is satisfied and the total integral is evaluated and 
returned to the main calling function. 

function q = quadadapt(f,a,b,tol,varargin)
% Evaluates definite integral of f(x) from a to b
if nargin < 4 || isempty(tol), tol = 1.e-6; end
c = (a + b)/2;
fa = feval(f,a,varargin{:});
fc = feval(f,c,varargin{:});
fb = feval(f,b,varargin{:});
q = quadstep(f, a, b, tol, fa, fc, fb, varargin{:});
end

function q = quadstep(f,a,b,tol,fa,fc,fb,varargin)
% Recursive subfunction used by quadadapt.
h = b - a; c = (a + b)/2;
fd = feval(f,(a+c)/2,varargin{:});
fe = feval(f,(c+b)/2,varargin{:});
q1 = h/6 * (fa + 4*fc + fb);                 % Eq. (20.25)
q2 = h/12 * (fa + 4*fd + 2*fc + 4*fe + fb);  % Eq. (20.26)
if abs(q2 - q1) <= tol
  q = q2 + (q2 - q1)/15;                     % Eq. (20.30)
else
  qa = quadstep(f, a, c, tol, fa, fd, fc, varargin{:});
  qb = quadstep(f, c, b, tol, fc, fe, fb, varargin{:});
  q = qa + qb;
end
end

FIGURE 20.6
An M-file to implement an adaptive quadrature algorithm based on an algorithm originally
developed by Cleve Moler (2004).


It should be stressed that the algorithm in Fig. 20.6 is a stripped-down version of the
integral function, which is the professional numerical integration function employed in MATLAB.
Thus, it does not guard against failure such as cases where integrals do not exist. Nevertheless,
it works just fine for many applications, and certainly serves to illustrate how adaptive
quadrature works. Here is a MATLAB session showing how quadadapt can be used to determine the
integral of the polynomial from Example 20.1:


>> f=@(x) 0.2+25*x-200*x^2+675*x^3-900*x^4+400*x^5;
>> q = quadadapt(f,0,0.8)


q =
   1.640533333333336


20.4.2 MATLAB Function: integral 
MATLAB has a function for implementing adaptive quadrature: 
q = integral(fun, a, b) 


where fun is the function to be integrated, and a and b = the integration bounds. It should be
noted that array operators .*, ./ and .^ should be used in the definition of fun.
EXAMPLE 20.5 Adaptive Quadrature


Problem Statement. Use integral to integrate the following function:

$$f(x) = \frac{1}{(x - q)^2 + 0.01} + \frac{1}{(x - r)^2 + 0.04} - s$$

between the limits x = 0 to 1. Note that for q = 0.3, r = 0.9, and s = 6, this is the built-in humps
function that MATLAB uses to demonstrate some of its numerical capabilities. The humps
function exhibits both flat and steep regions over a relatively short x range. Hence, it is useful
for demonstrating and testing a function like integral. Note that the humps function can be
integrated analytically between the given limits to yield an exact integral of 29.85832539549867.


Solution. First, let’s evaluate the integral using the built-in version of humps:


>> format long
>> Q=integral(@(x) humps(x),0,1)


Q =
  29.85832612842764


Thus, the solution is correct to seven significant digits. 
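
The same integral can also be evaluated from the general form in the problem statement rather than from the built-in humps. The sketch below assumes the parameter values q = 0.3, r = 0.9, and s = 6 quoted above; the handle name f is an illustrative choice.

```matlab
% Integrate the parameterized humps-like function with integral.
q = 0.3; r = 0.9; s = 6;             % values that reproduce the built-in humps
f = @(x) 1./((x - q).^2 + 0.01) + 1./((x - r).^2 + 0.04) - s;
Q = integral(f, 0, 1);               % default tolerances
fprintf('Q = %.10f\n', Q)
```

Note the array operators ./ and .^ in the definition of f, which integral requires because it evaluates the function at many points in a single call.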


20.5 CASE STUDY ROOT-MEAN-SQUARE CURRENT


Background. Because it results in efficient energy transmission, the current in an AC 
circuit is often in the form of a sine wave: 


$$i = i_{peak} \sin(\omega t)$$


where $i$ = the current (A = C/s), $i_{peak}$ = the peak current (A), $\omega$ = the angular frequency
(radians/s), and t = time (s). The angular frequency is related to the period T (s) by $\omega = 2\pi/T$.

The power generated is related to the magnitude of the current. Integration can be used 
to determine the average current over one cycle: 


$$\bar{i} = \frac{1}{T}\int_0^T i_{peak} \sin(\omega t)\,dt = \frac{i_{peak}}{\omega T}\,(-\cos(2\pi) + \cos(0)) = 0$$


Despite the fact that the average is zero, such a current is capable of generating power. 
Therefore, an alternative to the average current must be derived. 

To do this, electrical engineers and scientists determine the root mean square current 
(A), which is calculated as 


$$i_{rms} = \sqrt{\frac{1}{T}\int_0^T i_{peak}^2 \sin^2(\omega t)\,dt} = \frac{i_{peak}}{\sqrt{2}} \tag{20.31}$$


Thus, as the name implies, the rms current is the square root of the mean of the squared
current. Because $1/\sqrt{2} = 0.70711$, $i_{rms}$ is equal to about 70% of the peak current for our
assumed sinusoidal waveform.
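
Both results for the sinusoid can be confirmed numerically. The following sketch assumes a peak current of 1 A and a period of 1 s purely for illustration; neither value is specified in the text.

```matlab
% Average and rms current for a sinusoid (assumed ipeak = 1 A, T = 1 s).
ipeak = 1; T = 1; w = 2*pi/T;
i = @(t) ipeak*sin(w*t);
iavg = integral(i, 0, T)/T;                     % average over one cycle
irms = sqrt(integral(@(t) i(t).^2, 0, T)/T);    % Eq. (20.31), left-hand form
fprintf('iavg = %.2e   irms = %.6f   ipeak/sqrt(2) = %.6f\n', ...
        iavg, irms, ipeak/sqrt(2))
```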




This quantity has meaning because it is directly related to the average power absorbed
by an element in an AC circuit. To understand this, recall that Joule’s law states that the
instantaneous power absorbed by a circuit element is equal to the product of the voltage across
it and the current through it:

$$P = iV \tag{20.32}$$

where P = the power (W = J/s), and V = voltage (V = J/C). For a resistor, Ohm’s law states
that the voltage is directly proportional to the current:

$$V = iR \tag{20.33}$$

where R = the resistance ($\Omega$ = V/A = J · s/C$^2$). Substituting Eq. (20.33) into (20.32) gives

$$P = i^2 R \tag{20.34}$$

The average power can be determined by integrating Eq. (20.34) over a period with the result:

$$\bar{P} = i_{rms}^2 R$$

Thus, the AC circuit generates the equivalent power as a DC circuit with a constant current
of $i_{rms}$.

Now, although the simple sinusoid is widely employed, it is by no means the only
waveform that is used. For some of these forms, such as triangular or square waves, the $i_{rms}$
can be evaluated analytically with closed-form integration. However, some waveforms must
be analyzed with numerical integration methods.

In this case study, we will calculate the root-mean-square current of a nonsinusoidal
waveform. We will use both the Newton-Cotes formulas from Chap. 19 as well as the
approaches described in this chapter.


Solution. The integral that must be evaluated is

$$i_{rms}^2 = \int_0^{1/2} (10e^{-t} \sin 2\pi t)^2\,dt \tag{20.35}$$

For comparative purposes, the exact value of this integral to fifteen significant digits is
15.41260804810169.

Integral estimates for various applications of the trapezoidal rule and Simpson’s 1/3
rule are listed in Table 20.2. Notice that Simpson’s rule is more accurate than the trapezoidal
rule. The value for the integral to seven significant digits is obtained using a 128-segment
trapezoidal rule or a 32-segment Simpson’s rule.


TABLE 20.2 Values for the integral calculated using Newton-Cotes formulas.

Technique            Segments   Integral        $\varepsilon_t$ (%)
Trapezoidal rule     1          0.0             100.0000
                     2          15.163266493    1.6178
                     4          15.401429095    0.0725
                     8          15.411958360    $4.22 \times 10^{-3}$
                     16         15.412568151    $2.59 \times 10^{-4}$
                     32         15.412605565    $1.61 \times 10^{-5}$
                     64         15.412607893    $1.01 \times 10^{-6}$
                     128        15.412608038    $6.28 \times 10^{-8}$
Simpson’s 1/3 rule   2          20.217688657    31.1763
                     4          15.480816629    0.4426
                     8          15.415468115    0.0186
                     16         15.412771415    $1.06 \times 10^{-3}$
                     32         15.412618037    $6.48 \times 10^{-5}$


The M-file we developed in Fig. 20.2 can be used to evaluate the integral with Romberg
integration:


>> format long
>> i2=@(t) (10*exp(-t).*sin(2*pi*t)).^2;
>> [q,ea,iter]=romberg(i2,0,.5)

q =
  15.41260804288977
ea =
    1.480058787326946e-008
iter =
     5


Thus, with the default stopping criterion of es = $1 \times 10^{-6}$, we obtain a result that is correct
to over nine significant figures in five iterations. We can obtain an even better result if we
impose a more stringent stopping criterion:


>> [q,ea,iter]=romberg(i2,0,.5,1e-15)

q =
  15.41260804810169

Gauss quadrature can also be used to make the same estimate. First, a change in
variable is performed by applying Eqs. (20.22) and (20.23) to yield

$$t = \frac{1}{4} + \frac{1}{4}\,t_d \qquad dt = \frac{1}{4}\,dt_d$$

These relationships can be substituted into Eq. (20.35) to yield

$$i_{rms}^2 = \int_{-1}^{1} \left[10e^{-(0.25 + 0.25t_d)} \sin 2\pi(0.25 + 0.25t_d)\right]^2 0.25\,dt_d \tag{20.36}$$
For the two-point Gauss-Legendre formula, this function is evaluated at $t_d = -1/\sqrt{3}$ and
$1/\sqrt{3}$, with the results being 7.684096 and 4.313728, respectively. These values can be
substituted into Eq. (20.17) to yield an integral estimate of 11.99782, which represents an
error of $\varepsilon_t$ = 22.1%.

The three-point formula is (Table 20.1) 


I = 0.5555556(1.237449) + 0.8888889(15.16327) + 0.5555556(2.684915) = 15.65755 


which has $\varepsilon_t$ = 1.6%. The results of using the higher-point formulas are summarized in
Table 20.3.
Finally, the integral can be evaluated with the built-in MATLAB function integral: 


>> irms2=integral(i2,0,.5)
irms2 =
  15.412608049345090


We can now compute the $i_{rms}$ by merely taking the square root of the integral. For
example, using the result computed with integral, we get


>> irms=sqrt(irms2) 


irms = 
3.925889459485796 
This result could then be employed to guide other aspects of the design and operation of the 
circuit such as power dissipation computations. 

As we did for the simple sinusoid in Eq. (20.31), an interesting calculation involves 
comparing this result with the peak current. Recognizing that this is an optimization prob- 
lem, we can readily employ the fminbnd function to determine this value. Because we are 
looking for a maximum, we evaluate the negative of the function: 


>> [tmax, imax]=fminbnd(@(t) -10*exp(-t).*sin(2*pi*t) ,0,.5) 


tmax = 
0.22487940319321 
imax = 
-7.886853873932577


A maximum current of 7.88685 A occurs at t = 0.2249 s. Hence, for this particular wave 
form, the root-mean-square value is about 49.8% of the maximum. 


TABLE 20.3 Results of using various-point Gauss quadrature
formulas to approximate the integral.


Points   Estimate      $\varepsilon_t$ (%)
2        11.9978243    22.1
3        15.6575502    1.59
4        15.4058023    $4.42 \times 10^{-2}$
5        15.4126391    $2.01 \times 10^{-4}$
6        15.4126109    $1.82 \times 10^{-5}$
PROBLEMS


20.1 Use Romberg integration to evaluate 


1= f (r+ ax 


to an accuracy of $\varepsilon_s$ = 0.5%. Your results should be presented
in the format of Fig. 20.1. Use the analytical solution of the
integral to determine the percent relative error of the result
obtained with Romberg integration. Check that $\varepsilon_t$ is less than $\varepsilon_s$.
20.2 Evaluate the following integral (a) analytically,
(b) Romberg integration ($\varepsilon_s$ = 0.5%), (c) the three-point Gauss
quadrature formula, and (d) MATLAB integral function:

$$I = \int_0^8 (-0.055x^4 + 0.86x^3 - 4.2x^2 + 6.3x + 2)\,dx$$


20.3 Evaluate the following integral with (a) Romberg
integration ($\varepsilon_s$ = 0.5%), (b) the two-point Gauss quadrature
formula, and (c) MATLAB integral function:


3 
1=f xe™ dx 
0 


20.4 There is no closed form solution for the error function

$$\operatorname{erf}(a) = \frac{2}{\sqrt{\pi}} \int_0^a e^{-x^2}\,dx$$


Use the (a) two-point and (b) three-point Gauss-Legendre 
formulas to estimate erf(1.5). Determine the percent relative 
error for each case based on the true value, which can be 
determined with MATLAB’s built-in function erf. 

20.5 The force on a sailboat mast can be represented by the 
following function: 


F= f “00 F e/4 dz 


where z = the elevation above the deck and H = the height of the
mast. Compute F for the case where H = 30 using (a) Romberg
integration to a tolerance of $\varepsilon_s$ = 0.5%, (b) the two-point
Gauss-Legendre formula, and (c) the MATLAB integral function.
20.6 The root-mean-square current can be computed as

$$I_{RMS} = \sqrt{\frac{1}{T}\int_0^T i^2(t)\,dt}$$

For T = 1, suppose that i(t) is defined as

$$i(t) = 8e^{-t/T} \sin\left(2\pi \frac{t}{T}\right) \qquad \text{for } 0 \le t \le T/2$$
$$i(t) = 0 \qquad \text{for } T/2 \le t \le T$$

Evaluate the $I_{RMS}$ using (a) Romberg integration to a
tolerance of 0.1%, (b) the two- and three-point Gauss-Legendre
formulas, and (c) the MATLAB integral function.


20.7 The heat required, AH (cal), to induce a temperature 
change, AT(°C), of a material can be computed as 


AH = mC,(T)AT 


where m = mass (g), and C7) = heat capacity [cal/(g .°C)]. 
The heat capacity increases with temperature, T (°C), 
according to 


C,(T) = 0.132 + 1.56 x 107 + 2.64 x 10°77? 


Write a script that uses the integral function to generate a 
plot of AH versus AT for cases where m = 1 kg, the starting 
temperature is —100 °C, and AT ranges from 0 to 300 °C. 
20.8 The amount of mass transported via a pipe over a
period of time can be computed as

$$M = \int_{t_1}^{t_2} Q(t)c(t)\,dt$$

where M = mass (mg), $t_1$ = the initial time (min), $t_2$ = the final
time (min), Q(t) = flow rate (m$^3$/min), and c(t) = concentration
(mg/m$^3$). The following functional representations define the
temporal variations in flow and concentration:

$$Q(t) = 9 + 5\cos^2(0.4t)$$

c(t) = 5e 0 + 20-150 
Determine the mass transported between $t_1$ = 2 and $t_2$ = 8 min
with (a) Romberg integration to a tolerance of 0.1% and
(b) the MATLAB integral function.
20.9 Evaluate the double integral 


2 p4 
[S [ 2-3 +9) aay 
-240 


(a) analytically, with the 2-point Gauss quadrature, and 
(b) with the integral2 function. 

20.10 Compute work as described in Sec. 19.9, but use the
following equations for $F(x)$ and $\theta(x)$:

$$F(x) = 1.6x - 0.045x^2$$
$$\theta(x) = -0.00055x^3 + 0.0123x^2 + 0.13x$$

The force is in newtons and the angle is in radians. Perform
the integration from x = 0 to 30 m.


20.11 Perform the same computation as in Sec. 20.5, but for 
the current as specified by 


i(t) = 6e7!* sin 2at 


i(t)=0 


forO<t<T/2 
for T/2<t<T 


where T= 1s. 




20.12 Compute the power absorbed by an element in a
circuit as described in Sec. 20.5, but for a simple sinusoidal
current $i = \sin(2\pi t/T)$ where T = 1 s.

(a) Assume that Ohm’s law holds and R = 5 $\Omega$.

(b) Assume that Ohm’s law does not hold and that voltage 
and current are related by the following nonlinear rela- 
tionship: V = (5i — 1.250). 

20.13 Suppose that the current through a resistor is de- 

scribed by the function 


$$i(t) = (60 - t)^2 + (60 - t)\sin(\sqrt{t})$$
and the resistance is a function of the current: 
R= 10i + 27° 


Compute the average voltage over t = 0 to 60 using the 
composite Simpson’s 1/3 rule. 

20.14 If a capacitor initially holds no charge, the voltage 
across it as a function of time can be computed as 


$$V(t) = \frac{1}{C}\int_0^t i(t)\,dt$$


Use MATLAB to fit these data with a fifth-order polynomial. 
Then, use a numerical integration function along with a value 
of C = 10” farad to generate a plot of voltage versus time. 


t, s             0      0.2      0.4      0.6      0.8      1        1.2
i, $10^{-3}$ A   0.2    0.3683   0.3819   0.2282   0.0486   0.0082   0.1441


20.15 The work done on an object is equal to the force times 
the distance moved in the direction of the force. The velocity 
of an object in the direction of a force is given by 


$$v = 4t \qquad 0 \le t \le 5$$
$$v = 20 + (5 - t)^2 \qquad 5 \le t \le 15$$


where v is in m/s. Determine the work if a constant force of 
200 N is applied for all t. 

20.16 A rod subject to an axial load (Fig. P20.16a) will be 
deformed, as shown in the stress-strain curve in Fig. P20.16b. 
The area under the curve from zero stress out to the point of 
rupture is called the modulus of toughness of the material. It 
provides a measure of the energy per unit volume required 
to cause the material to rupture. As such, it is representative 
of the material’s ability to withstand an impact load. Use 
numerical integration to compute the modulus of toughness 
for the stress-strain curve seen in Fig. P20.16b. 


FIGURE P20.16
(a) A rod under axial loading and (b) the resulting stress-strain curve,
where stress is in kips per square inch ($10^3$ lb/in$^2$), and strain is
dimensionless.
FIGURE P20.17


20.17 If the velocity distribution of a fluid flowing through
a pipe is known (Fig. P20.17), the flow rate Q (i.e., the
volume of water passing through the pipe per unit time) can be
computed by $Q = \int v\,dA$, where v is the velocity, and A is
the pipe’s cross-sectional area. (To grasp the meaning of this
relationship physically, recall the close connection between
summation and integration.) For a circular pipe, $A = \pi r^2$ and
$dA = 2\pi r\,dr$. Therefore,

$$Q = \int_0^{r_0} v(2\pi r)\,dr$$


where r is the radial distance measured outward from the 
center of the pipe. If the velocity distribution is given by 


a= =) 


where $r_0$ is the total radius (in this case, 3 cm), compute Q
using the composite trapezoidal rule. Discuss the results.
20.18 Using the following data, calculate the work done by 
stretching a spring that has a spring constant of k = 300 N/m 
to x = 0.35 m. To do this, first fit the data with a polynomial 
and then integrate the polynomial numerically to compute 
the work: 


F, $10^3$ · N   0      0.01    0.028   0.046   0.063   0.082   0.11    0.13
x, m            0      0.05    0.10    0.15    0.20    0.25    0.30    0.35


20.19 Evaluate the vertical distance traveled by a rocket if 
the vertical velocity is given by 


v = 11t$^2$ - 5t              0 ≤ t ≤ 10
v = 1100 - 5t                 10 ≤ t ≤ 20
v = 50t + 2(t - 20)           20 ≤ t ≤ 30


20.20 The upward velocity of a rocket can be computed by 
the following formula: 


$$v = u \ln\left(\frac{m_0}{m_0 - qt}\right) - gt$$


where v = upward velocity, u = velocity at which fuel is
expelled relative to the rocket, $m_0$ = initial mass of the rocket
at time t = 0, q = fuel consumption rate, and g = downward
acceleration of gravity (assumed constant = 9.81 m/s$^2$). If
u = 1850 m/s, $m_0$ = 160,000 kg, and q = 2500 kg/s,
determine how high the rocket will fly in 30 s.

20.21 The normal distribution is defined as 


$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$
(a) Use MATLAB to integrate this function from x = —1 to 
1 and from —2 to 2. 
(b) Use MATLAB to determine the inflection points of this 
function. 
20.22 Use Romberg integration to evaluate 


r 

2 

fe sina gy 
0 l+x 


to an accuracy of e, = 0.5%. Your results should be pre- 
sented in the form of Fig. 20.1. 

20.23 Recall that the velocity of the free-falling bungee 
jumper can be computed analytically as [Eq. (1.9)]: 


$$v(t) = \sqrt{\frac{gm}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\,t\right)$$


where v(t) = velocity (m/s), t = time (s), g = 9.81 m/s$^2$,
m = mass (kg), $c_d$ = drag coefficient (kg/m).

(a) Use Romberg integration to compute how far the jumper
travels during the first 8 seconds of free fall given m =
80 kg and $c_d$ = 0.2 kg/m. Compute the answer to $\varepsilon_s$ = 1%.

(b) Perform the same computation with integral. 

20.24 Prove that Eq. (20.30) is equivalent to Boole’s rule. 

20.25 As specified in the following table, the earth’s density 

varies as a function of the distance from its center (r = 0): 


r, km            0     1100   1500   2450   3400   3630   4500   5380   6060   6280   6380
$\rho$, g/cm$^3$ 13    12.4   12     11.2   9.7    5.7    5.2    4.7    3.6    3.4    3


Develop a script to fit these data with interp1 using the 
pchip option. Generate a plot showing the resulting fit along 
with the data points. Then use one of MATLAB’s integra- 
tion functions to estimate the earth’s mass (in metric tonnes) 
by integrating the output of the interp1 function. 

20.26 Develop an M-file function to implement Romberg 
integration based on Fig. 20.2. Test the function by using it 
to determine the integral of the polynomial from Example
20.1. Then use it to solve Prob. 20.1. 

20.27 Develop an M-file function to implement adaptive 
quadrature based on Fig. 20.6. Test the function by using it to 
determine the integral of the polynomial from Example 20.1. 
Then use it to solve Prob. 20.20. 

20.28 The average flow in a river channel, Q (m$^3$/s), with an
irregular cross-section can be computed as the integral of the
product of velocity and depth

$$Q = \int_0^B U(y)H(y)\,dy$$

where U(y) = water velocity (m/s) at distance y (m) from the
bank, and H(y) = water depth (m) at distance y from the
bank. Use integral along with spline fits of U and H to
the following data collected at different distances across the 
channel to estimate the flow. 


y (m) H (m) Y (m) U (m/s) 
0 0 0 
0721 1.6 0.08 
: 0.78 4.1 0.61 
4.6 1.87 4.8 0.68 
6 1.44 6.1 0.55 
8.1 1.28 6.8 0.42 


9 0.2 9 0 


20.29 Use the two-point Gauss quadrature approach to
estimate the average value of the following function between
a = 1 and b = 5


2—2 
Jo)= 1+ 


20.30 Evaluate the following integral 


4 
1= f êa 
0 


(a) Analytically. 

(b) Using the MATLAB integral function. 

(c) Using Monte Carlo integration. 

20.31 The MATLAB humps function defines a curve that has
2 maxima (peaks) of unequal height over the interval 0 ≤ x ≤ 2.
Develop a MATLAB script to determine the integral over
the interval with (a) the MATLAB integral function and
(b) Monte Carlo integration.

20.32 Evaluate the following double integral 


2 fl 
I= I 1 y Qa? + xy) dxdy 
0 43 


(a) Using a single application of Simpson’s 1/3 rule across 
each dimension. 
(b) Check your results with the integral2 function. 
CHAPTER OBJECTIVES


The primary objective of this chapter is to introduce you to numerical differentiation.
Specific objectives and topics covered are


• Understanding the application of high-accuracy numerical differentiation formulas
  for equispaced data.
• Knowing how to evaluate derivatives for unequally spaced data.
• Understanding how Richardson extrapolation is applied for numerical differentiation.
• Recognizing the sensitivity of numerical differentiation to data error.
• Knowing how to evaluate derivatives in MATLAB with the diff and gradient
  functions.
• Knowing how to generate contour plots and vector fields with MATLAB.


YOU’VE GOT A PROBLEM 


Recall that the velocity of a free-falling bungee jumper as a function of time can be
computed as

$$v(t) = \sqrt{\frac{gm}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\,t\right) \tag{21.1}$$

At the beginning of Chap. 19, we used calculus to integrate this equation to determine the
vertical distance z the jumper has fallen after a time t.

$$z(t) = \frac{m}{c_d} \ln\left[\cosh\left(\sqrt{\frac{g c_d}{m}}\,t\right)\right] \tag{21.2}$$
Now suppose that you were given the reverse problem. That is, you were asked to
determine velocity based on the jumper’s position as a function of time. Because it is the
inverse of integration, differentiation could be used to make the determination:

$$v(t) = \frac{dz(t)}{dt} \tag{21.3}$$

Substituting Eq. (21.2) into Eq. (21.3) and differentiating would bring us back to Eq. (21.1).

Beyond velocity, you might also be asked to compute the jumper’s acceleration. To
do this, we could either take the first derivative of velocity, or the second derivative of
displacement:

$$a(t) = \frac{dv(t)}{dt} = \frac{d^2 z(t)}{dt^2} \tag{21.4}$$

In either case, the result would be

$$a(t) = g\,\operatorname{sech}^2\left(\sqrt{\frac{g c_d}{m}}\,t\right) \tag{21.5}$$


Although a closed-form solution can be developed for this case, there are other func- 
tions that may be difficult or impossible to differentiate analytically. Further, suppose that 
there was some way to measure the jumper’s position at various times during the fall. These 
distances along with their associated times could be assembled as a table of discrete val- 
ues. In this situation, it would be useful to differentiate the discrete data to determine the 
velocity and the acceleration. In both these instances, numerical differentiation methods 
are available to obtain solutions. This chapter will introduce you to some of these methods. 


21.1 INTRODUCTION AND BACKGROUND


21.1.1 What Is Differentiation? 


Calculus is the mathematics of change. Because engineers and scientists must continu- 
ously deal with systems and processes that change, calculus is an essential tool of our 
profession. Standing at the heart of calculus is the mathematical concept of differentiation. 
According to the dictionary definition, to differentiate means “to mark off by differ- 
ences; distinguish; . . . to perceive the difference in or between.” Mathematically, the deriva- 
tive, which serves as the fundamental vehicle for differentiation, represents the rate of change 
of a dependent variable with respect to an independent variable. As depicted in Fig. 21.1, the 
mathematical definition of the derivative begins with a difference approximation: 


$$ \frac{\Delta y}{\Delta x} = \frac{f(x_i + \Delta x) - f(x_i)}{\Delta x} \tag{21.6} $$

where y and f(x) are alternative representatives for the dependent variable and x is the
independent variable. If $\Delta x$ is allowed to approach zero, as occurs in moving from
Fig. 21.1a to c, the difference becomes a derivative:

$$ \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{f(x_i + \Delta x) - f(x_i)}{\Delta x} \tag{21.7} $$

FIGURE 21.1
The graphical definition of a derivative: as $\Delta x$ approaches zero in going from (a) to (c), the
difference approximation becomes a derivative.


where dy/dx [which can also be designated as $y'$ or $f'(x_i)$]¹ is the first derivative of y with
respect to x evaluated at $x_i$. As seen in the visual depiction of Fig. 21.1c, the derivative is
the slope of the tangent to the curve at $x_i$.
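
The limiting process can be illustrated numerically. The sketch below evaluates the difference approximation for successively smaller increments and tabulates how far each estimate is from the exact slope. The test function sin x, the point x = 1, and the list of increments are assumptions made for illustration.

```matlab
f  = @(x) sin(x);            % assumed test function
xi = 1;                      % point at which the slope is sought
dx = 10.^(-(1:6));           % successively smaller increments

slope = (f(xi + dx) - f(xi))./dx;   % difference approximation, Eq. (21.6)
err   = abs(slope - cos(xi));       % cos(x) is the exact derivative of sin(x)

disp([dx' slope' err'])      % columns: increment, estimate, absolute error
```

For these increments the error shrinks roughly in proportion to the increment, which is consistent with the difference approaching the derivative as the increment approaches zero.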

The second derivative represents the derivative of the first derivative,

$$ \frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) \tag{21.8} $$

Thus, the second derivative tells us how fast the slope is changing. It is commonly referred
to as the curvature, because a high value for the second derivative means high curvature.
Finally, partial derivatives are used for functions that depend on more than one variable.
Partial derivatives can be thought of as taking the derivative of the function at a point
with all but one variable held constant. For example, given a function f that depends on both
x and y, the partial derivative of f with respect to x at an arbitrary point (x, y) is defined as

$$ \frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x} \tag{21.9} $$

Similarly, the partial derivative of f with respect to y is defined as

$$ \frac{\partial f}{\partial y} = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y} \tag{21.10} $$


To get an intuitive grasp of partial derivatives, recognize that a function that depends on
two variables is a surface rather than a curve. Suppose you are mountain climbing and have
access to a function f that yields elevation as a function of longitude (the east-west oriented
x axis) and latitude (the north-south oriented y axis). If you stop at a particular point $(x_0, y_0)$,
the slope to the east would be $\partial f(x_0, y_0)/\partial x$, and the slope to the north would be
$\partial f(x_0, y_0)/\partial y$.

¹ The form dy/dx was devised by Leibnitz, whereas $y'$ is attributed to Lagrange. Note that Newton used the
so-called dot notation: $\dot{y}$. Today, the dot notation is usually used for time derivatives.


21.1.2 Differentiation in Engineering and Science 


The differentiation of a function has so many engineering and scientific applications that 
you were required to take differential calculus in your first year at college. Many specific 
examples of such applications could be given in all fields of engineering and science. 
Differentiation is commonplace in engineering and science because so much of our work 
involves characterizing the changes of variables in both time and space. In fact, many of 
the laws and other generalizations that figure so prominently in our work are based on the 
predictable ways in which change manifests itself in the physical world. A prime example 
is Newton’s second law, which is not couched in terms of the position of an object but 
rather in its change with respect to time. 

Aside from such temporal examples, numerous laws involving the spatial behavior of 
variables are expressed in terms of derivatives. Among the most common of these are the 
constitutive laws that define how potentials or gradients influence physical processes. For 
example, Fourier’s law of heat conduction quantifies the observation that heat flows from 
regions of high to low temperature. For the one-dimensional case, this can be expressed 
mathematically as 


$$ q = -k \frac{dT}{dx} \tag{21.11} $$

where q(x) = heat flux (W/m²), k = coefficient of thermal conductivity [W/(m · K)], T =
temperature (K), and x = distance (m). Thus, the derivative, or gradient, provides a measure
of the intensity of the spatial temperature change, which drives the transfer of heat
(Fig. 21.2).


FIGURE 21.2 

Graphical depiction of a temperature gradient. Because heat moves “downhill” from high to 
low temperature, the flow in (a) is from left to right. However, due to the orientation of Cartesian 
coordinates, the slope is negative for this case. Thus, a negative gradient leads to a positive 
flow. This is the origin of the minus sign in Fourier’s law of heat conduction. The reverse case is 
depicted in (b), where the positive gradient leads to a negative heat flow from right to left. 


TABLE 21.1 The one-dimensional forms of some constitutive laws commonly used in
engineering and science.


Law                      Equation                   Physical Area               Gradient        Flux           Proportionality
Fourier’s law            $q = -k\,dT/dx$            Heat conduction             Temperature     Heat flux      Thermal conductivity
Fick’s law               $J = -D\,dc/dx$            Mass diffusion              Concentration   Mass flux      Diffusivity
Darcy’s law              $q = -k\,dh/dx$            Flow through porous media   Head            Flow flux      Hydraulic conductivity
Ohm’s law                $J = -\sigma\,dV/dx$       Current flow                Voltage         Current flux   Electrical conductivity
Newton’s viscosity law   $\tau = \mu\,du/dx$        Fluids                      Velocity        Shear stress   Dynamic viscosity
Hooke’s law              $\sigma = E\,\Delta L/L$   Elasticity                  Deformation     Stress         Young’s modulus


Similar laws provide workable models in many other areas of engineering and science, 
including the modeling of fluid dynamics, mass transfer, chemical reaction kinetics, elec- 
tricity, and solid mechanics (Table 21.1). The ability to accurately estimate derivatives is 
an important facet of our capability to work effectively in these areas. 

Beyond direct engineering and scientific applications, numerical differentiation 
is also important in a variety of general mathematical contexts including other areas of 
numerical methods. For example, recall that in Chap. 6 the secant method was based on a 
finite-difference approximation of the derivative. In addition, probably the most important 
application of numerical differentiation involves the solution of differential equations. We 
have already seen an example in the form of Euler’s method in Chap. 1. In Chap. 24, we 
will investigate how numerical differentiation provides the basis for solving boundary- 
value problems of ordinary differential equations. 

These are just a few of the applications of differentiation that you might face regularly in 
the pursuit of your profession. When the functions to be analyzed are simple, you will nor- 
mally choose to evaluate them analytically. However, it is often difficult or impossible when 
the function is complicated. In addition, the underlying function is often unknown and de- 
fined only by measurement at discrete points. For both these cases, you must have the ability 
to obtain approximate values for derivatives, using numerical techniques as described next. 


21.2 HIGH-ACCURACY DIFFERENTIATION FORMULAS


We have already introduced the notion of numerical differentiation in Chap. 4. Recall that
we employed Taylor series expansions to derive finite-difference approximations of derivatives.
In Chap. 4, we developed forward, backward, and centered difference approximations
of first and higher derivatives. Remember that, at best, these estimates had errors that
were $O(h^2)$; that is, their errors were proportional to the square of the step size. This level
of accuracy is due to the number of terms of the Taylor series that were retained during the
derivation of these formulas. We will now illustrate how high-accuracy finite-difference
formulas can be generated by including additional terms from the Taylor series expansion.
For example, the forward Taylor series expansion can be written as [recall Eq. (4.13)]

$$ f(x_{i+1}) = f(x_i) + f'(x_i)\,h + \frac{f''(x_i)}{2!}\,h^2 + \cdots \tag{21.12} $$

which can be solved for

$$ f'(x_i) = \frac{f(x_{i+1}) - f(x_i)}{h} - \frac{f''(x_i)}{2!}\,h + O(h^2) \tag{21.13} $$

In Chap. 4, we truncated this result by excluding the second- and higher-derivative terms
and were thus left with a forward-difference formula:

$$ f'(x_i) = \frac{f(x_{i+1}) - f(x_i)}{h} + O(h) \tag{21.14} $$

In contrast to this approach, we now retain the second-derivative term by substituting
the following forward-difference approximation of the second derivative [recall Eq. (4.27)]:

$$ f''(x_i) = \frac{f(x_{i+2}) - 2f(x_{i+1}) + f(x_i)}{h^2} + O(h) \tag{21.15} $$

into Eq. (21.13) to yield

$$ f'(x_i) = \frac{f(x_{i+1}) - f(x_i)}{h} - \frac{f(x_{i+2}) - 2f(x_{i+1}) + f(x_i)}{2h^2}\,h + O(h^2) \tag{21.16} $$

or, by collecting terms:

$$ f'(x_i) = \frac{-f(x_{i+2}) + 4f(x_{i+1}) - 3f(x_i)}{2h} + O(h^2) \tag{21.17} $$

Notice that inclusion of the second-derivative term has improved the accuracy to
$O(h^2)$. Similar improved versions can be developed for the backward and centered formulas
as well as for the approximations of higher-order derivatives. The formulas are summarized
in Fig. 21.3 through Fig. 21.5 along with the lower-order versions from Chap. 4. The following
example illustrates the utility of these formulas for estimating derivatives.


EXAMPLE 21.1 High-Accuracy Differentiation Formulas


Problem Statement. Recall that in Example 4.4 we estimated the derivative of

$$ f(x) = -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2 $$


at x = 0.5 using finite differences and a step size of h = 0.25. The results are summarized
in the following table. Note that the errors are based on the true value of $f'(0.5) = -0.9125$.


                    Backward     Centered     Forward
                    $O(h)$       $O(h^2)$     $O(h)$
Estimate            -0.714       -0.934       -1.155
$\varepsilon_t$     21.7%        -2.4%        -26.5%


Repeat this computation, but employ the high-accuracy formulas from Fig. 21.3 through
Fig. 21.5.
First Derivative                                                                                              Error
$f'(x_i) = \dfrac{f(x_{i+1}) - f(x_i)}{h}$                                                                     $O(h)$
$f'(x_i) = \dfrac{-f(x_{i+2}) + 4f(x_{i+1}) - 3f(x_i)}{2h}$                                                    $O(h^2)$

Second Derivative
$f''(x_i) = \dfrac{f(x_{i+2}) - 2f(x_{i+1}) + f(x_i)}{h^2}$                                                    $O(h)$
$f''(x_i) = \dfrac{-f(x_{i+3}) + 4f(x_{i+2}) - 5f(x_{i+1}) + 2f(x_i)}{h^2}$                                    $O(h^2)$

Third Derivative
$f'''(x_i) = \dfrac{f(x_{i+3}) - 3f(x_{i+2}) + 3f(x_{i+1}) - f(x_i)}{h^3}$                                     $O(h)$
$f'''(x_i) = \dfrac{-3f(x_{i+4}) + 14f(x_{i+3}) - 24f(x_{i+2}) + 18f(x_{i+1}) - 5f(x_i)}{2h^3}$                $O(h^2)$

Fourth Derivative
$f''''(x_i) = \dfrac{f(x_{i+4}) - 4f(x_{i+3}) + 6f(x_{i+2}) - 4f(x_{i+1}) + f(x_i)}{h^4}$                      $O(h)$
$f''''(x_i) = \dfrac{-2f(x_{i+5}) + 11f(x_{i+4}) - 24f(x_{i+3}) + 26f(x_{i+2}) - 14f(x_{i+1}) + 3f(x_i)}{h^4}$ $O(h^2)$

FIGURE 21.3
Forward finite-difference formulas: two versions are presented for each derivative. The latter version
incorporates more terms of the Taylor series expansion and is, consequently, more accurate.


Solution. The data needed for this example are

$x_{i-2} = 0$         $f(x_{i-2}) = 1.2$
$x_{i-1} = 0.25$      $f(x_{i-1}) = 1.1035156$
$x_i = 0.5$           $f(x_i) = 0.925$
$x_{i+1} = 0.75$      $f(x_{i+1}) = 0.6363281$
$x_{i+2} = 1$         $f(x_{i+2}) = 0.2$

The forward difference of accuracy $O(h^2)$ is computed as (Fig. 21.3)

$$ f'(0.5) = \frac{-0.2 + 4(0.6363281) - 3(0.925)}{2(0.25)} = -0.859375 \qquad \varepsilon_t = 5.82\% $$

The backward difference of accuracy $O(h^2)$ is computed as (Fig. 21.4)

$$ f'(0.5) = \frac{3(0.925) - 4(1.1035156) + 1.2}{2(0.25)} = -0.878125 \qquad \varepsilon_t = 3.77\% $$

The centered difference of accuracy $O(h^4)$ is computed as (Fig. 21.5)

$$ f'(0.5) = \frac{-0.2 + 8(0.6363281) - 8(1.1035156) + 1.2}{12(0.25)} = -0.9125 \qquad \varepsilon_t = 0\% $$
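First Derivative                                                                                              Error
$f'(x_i) = \dfrac{f(x_i) - f(x_{i-1})}{h}$                                                                     $O(h)$
$f'(x_i) = \dfrac{3f(x_i) - 4f(x_{i-1}) + f(x_{i-2})}{2h}$                                                     $O(h^2)$

Second Derivative
$f''(x_i) = \dfrac{f(x_i) - 2f(x_{i-1}) + f(x_{i-2})}{h^2}$                                                    $O(h)$
$f''(x_i) = \dfrac{2f(x_i) - 5f(x_{i-1}) + 4f(x_{i-2}) - f(x_{i-3})}{h^2}$                                     $O(h^2)$

Third Derivative
$f'''(x_i) = \dfrac{f(x_i) - 3f(x_{i-1}) + 3f(x_{i-2}) - f(x_{i-3})}{h^3}$                                     $O(h)$
$f'''(x_i) = \dfrac{5f(x_i) - 18f(x_{i-1}) + 24f(x_{i-2}) - 14f(x_{i-3}) + 3f(x_{i-4})}{2h^3}$                 $O(h^2)$

Fourth Derivative
$f''''(x_i) = \dfrac{f(x_i) - 4f(x_{i-1}) + 6f(x_{i-2}) - 4f(x_{i-3}) + f(x_{i-4})}{h^4}$                      $O(h)$
$f''''(x_i) = \dfrac{3f(x_i) - 14f(x_{i-1}) + 26f(x_{i-2}) - 24f(x_{i-3}) + 11f(x_{i-4}) - 2f(x_{i-5})}{h^4}$  $O(h^2)$

FIGURE 21.4
Backward finite-difference formulas: two versions are presented for each derivative. The latter
version incorporates more terms of the Taylor series expansion and is, consequently, more accurate.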


As expected, the forward and backward differences are considerably more accurate than
the results from Example 4.4. However, surprisingly, the centered difference yields the
exact derivative at x = 0.5. This is because the formula based on the Taylor series is
equivalent to passing a fourth-order polynomial through the data points, and the function
being differentiated is itself a fourth-order polynomial.
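
The computations of this example are easy to reproduce. The following sketch evaluates the three high-accuracy formulas for the same function, point, and step size. The variable names and the helper et are assumptions introduced here for illustration.

```matlab
f = @(x) -0.1*x.^4 - 0.15*x.^3 - 0.5*x.^2 - 0.25*x + 1.2;
x = 0.5; h = 0.25; dtrue = -0.9125;

dfwd = (-f(x+2*h) + 4*f(x+h) - 3*f(x))/(2*h);                 % forward, O(h^2)
dbwd = (3*f(x) - 4*f(x-h) + f(x-2*h))/(2*h);                  % backward, O(h^2)
dctr = (-f(x+2*h) + 8*f(x+h) - 8*f(x-h) + f(x-2*h))/(12*h);   % centered, O(h^4)

et = @(d) (dtrue - d)/dtrue*100;      % true percent relative error
fprintf('forward   %10.6f   et = %6.2f%%\n', dfwd, et(dfwd));
fprintf('backward  %10.6f   et = %6.2f%%\n', dbwd, et(dbwd));
fprintf('centered  %10.6f   et = %6.2f%%\n', dctr, et(dctr));
```

Changing f to a function that is not a low-order polynomial is a simple way to see that the centered result is generally improved rather than exact.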
21.3 RICHARDSON EXTRAPOLATION


To this point, we have seen that there are two ways to improve derivative estimates when 
employing finite differences: (1) decrease the step size or (2) use a higher-order formula 
that employs more points. A third approach, based on Richardson extrapolation, uses two 
derivative estimates to compute a third, more accurate, approximation. 

Recall from Sec. 20.2.1 that Richardson extrapolation provided a means to obtain an 
improved integral estimate by the formula [Eq. (20.4)] 


$$ I = I(h_2) + \frac{1}{(h_1/h_2)^2 - 1}\left[I(h_2) - I(h_1)\right] \tag{21.18} $$

where $I(h_1)$ and $I(h_2)$ are integral estimates using two step sizes: $h_1$ and $h_2$. Because of its
convenience when expressed as a computer algorithm, this formula is usually written for
the case where $h_2 = h_1/2$, as in

$$ I = \frac{4}{3}\,I(h_2) - \frac{1}{3}\,I(h_1) \tag{21.19} $$




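First Derivative                                                                                              Error
$f'(x_i) = \dfrac{f(x_{i+1}) - f(x_{i-1})}{2h}$                                                                $O(h^2)$
$f'(x_i) = \dfrac{-f(x_{i+2}) + 8f(x_{i+1}) - 8f(x_{i-1}) + f(x_{i-2})}{12h}$                                  $O(h^4)$

Second Derivative
$f''(x_i) = \dfrac{f(x_{i+1}) - 2f(x_i) + f(x_{i-1})}{h^2}$                                                    $O(h^2)$
$f''(x_i) = \dfrac{-f(x_{i+2}) + 16f(x_{i+1}) - 30f(x_i) + 16f(x_{i-1}) - f(x_{i-2})}{12h^2}$                  $O(h^4)$

Third Derivative
$f'''(x_i) = \dfrac{f(x_{i+2}) - 2f(x_{i+1}) + 2f(x_{i-1}) - f(x_{i-2})}{2h^3}$                                $O(h^2)$
$f'''(x_i) = \dfrac{-f(x_{i+3}) + 8f(x_{i+2}) - 13f(x_{i+1}) + 13f(x_{i-1}) - 8f(x_{i-2}) + f(x_{i-3})}{8h^3}$ $O(h^4)$

Fourth Derivative
$f''''(x_i) = \dfrac{f(x_{i+2}) - 4f(x_{i+1}) + 6f(x_i) - 4f(x_{i-1}) + f(x_{i-2})}{h^4}$                      $O(h^2)$
$f''''(x_i) = \dfrac{-f(x_{i+3}) + 12f(x_{i+2}) - 39f(x_{i+1}) + 56f(x_i) - 39f(x_{i-1}) + 12f(x_{i-2}) - f(x_{i-3})}{6h^4}$   $O(h^4)$

FIGURE 21.5
Centered finite-difference formulas: two versions are presented for each derivative. The
latter version incorporates more terms of the Taylor series expansion and is, consequently,
more accurate.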


In a similar fashion, Eq. (21.19) can be written for derivatives as

$$ D = \frac{4}{3}\,D(h_2) - \frac{1}{3}\,D(h_1) \tag{21.20} $$

For centered difference approximations with $O(h^2)$, the application of this formula will
yield a new derivative estimate of $O(h^4)$.


EXAMPLE 21.2 Richardson Extrapolation 


Problem Statement. Using the same function as in Example 21.1, estimate the first derivative
at x = 0.5 employing step sizes of $h_1 = 0.5$ and $h_2 = 0.25$. Then use Eq. (21.20) to
compute an improved estimate with Richardson extrapolation. Recall that the true value
is -0.9125.


Solution. The first-derivative estimates can be computed with centered differences as

$$ D(0.5) = \frac{0.2 - 1.2}{1} = -1.0 \qquad \varepsilon_t = -9.6\% $$

and

$$ D(0.25) = \frac{0.6363281 - 1.103516}{0.5} = -0.934375 \qquad \varepsilon_t = -2.4\% $$

The improved estimate can be determined by applying Eq. (21.20) to give

$$ D = \frac{4}{3}(-0.934375) - \frac{1}{3}(-1) = -0.9125 $$

which for the present case is exact.
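
The same calculation can be sketched in a few lines of MATLAB. The anonymous function D and the variable names are assumptions introduced for illustration; the formula applied in the last step is Eq. (21.20).

```matlab
f = @(x) -0.1*x.^4 - 0.15*x.^3 - 0.5*x.^2 - 0.25*x + 1.2;
x = 0.5; h1 = 0.5; h2 = h1/2;

D  = @(h) (f(x+h) - f(x-h))/(2*h);   % centered difference, O(h^2)
D1 = D(h1);                          % estimate with the larger step
D2 = D(h2);                          % estimate with the halved step
Dr = 4/3*D2 - 1/3*D1;                % Richardson extrapolation, Eq. (21.20)

fprintf('D(h1) = %.6f, D(h2) = %.6f, extrapolated = %.6f\n', D1, D2, Dr);
```

Because only the two centered estimates enter the final line, the extrapolation costs almost nothing beyond the function evaluations already made.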


The previous example yielded an exact result because the function being analyzed
was a fourth-order polynomial. The exact outcome was due to the fact that Richardson
extrapolation is actually equivalent to fitting a higher-order polynomial through the data
and then evaluating the derivatives by centered divided differences. Thus, the present case
matched the derivative of the fourth-order polynomial precisely. For most other functions,
of course, this would not occur, and our derivative estimate would be improved but not
exact. Consequently, as was the case when Richardson extrapolation was applied to
integration, the approach can be applied iteratively using a Romberg algorithm until the
estimated error falls below an acceptable criterion.


21.4 DERIVATIVES OF UNEQUALLY SPACED DATA


The approaches discussed to this point are primarily designed to determine the derivative 
of a given function. For the finite-difference approximations of Sec. 21.2, the data had to be 
evenly spaced. For the Richardson extrapolation technique of Sec. 21.3, the data also had to 
be evenly spaced and generated for successively halved intervals. Such control of data spac- 
ing is usually available only in cases where we can use a function to generate a table of values. 

In contrast, empirically derived information—that is, data from experiments or field 
studies—are often collected at unequal intervals. Such information cannot be analyzed 
with the techniques discussed to this point. 

One way to handle nonequispaced data is to fit a Lagrange interpolating polynomial 
[recall Eq. (17.21)] to a set of adjacent points that bracket the location value at which you 
want to evaluate the derivative. Remember that this polynomial does not require that the 
points be equispaced. The polynomial can then be differentiated analytically to yield a 
formula that can be used to estimate the derivative. 

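For example, you can fit a second-order Lagrange polynomial to three adjacent points
$(x_0, y_0)$, $(x_1, y_1)$, and $(x_2, y_2)$. Differentiating the polynomial yields:

$$ f'(x) = f(x_0)\,\frac{2x - x_1 - x_2}{(x_0 - x_1)(x_0 - x_2)} + f(x_1)\,\frac{2x - x_0 - x_2}{(x_1 - x_0)(x_1 - x_2)} + f(x_2)\,\frac{2x - x_0 - x_1}{(x_2 - x_0)(x_2 - x_1)} \tag{21.21} $$
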

where x is the value at which you want to estimate the derivative. Although this equation is 
certainly more complicated than the first-derivative approximation from Fig. 21.3 through 
Fig. 21.5, it has some important advantages. First, it can provide estimates anywhere within 
the range prescribed by the three points. Second, the points themselves do not have to be 
equally spaced. Third, the derivative estimate is of the same accuracy as the centered
difference [Eq. (4.25)]. In fact, for equispaced points, Eq. (21.21) evaluated at $x = x_1$ reduces
to Eq. (4.25).
EXAMPLE 21.3 Differentiating Unequally Spaced Data


Problem Statement. As in Fig. 21.6, a temperature gradient can be measured down into the
soil. The heat flux at the soil-air interface can be computed with Fourier’s law (Table 21.1):

$$ q(z = 0) = -k \left.\frac{dT}{dz}\right|_{z=0} $$

where q(z) = heat flux (W/m²), k = coefficient of thermal conductivity for soil [= 0.5 W/(m · K)],
T = temperature (K), and z = distance measured down from the surface into the
soil (m). Note that a positive value for flux means that heat is transferred from the air to
the soil. Use numerical differentiation to evaluate the gradient at the soil-air interface and
employ this estimate to determine the heat flux into the ground.


Solution. Equation (21.21) can be used to calculate the derivative at the air-soil interface as 


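$$ f'(0) = 13.5\,\frac{2(0) - 0.0125 - 0.0375}{(0 - 0.0125)(0 - 0.0375)} + 12\,\frac{2(0) - 0 - 0.0375}{(0.0125 - 0)(0.0125 - 0.0375)} + 10\,\frac{2(0) - 0 - 0.0125}{(0.0375 - 0)(0.0375 - 0.0125)} $$

$$ = -1440 + 1440 - 133.333 = -133.333 \ \text{K/m} $$

which can be used to compute

$$ q(z = 0) = -0.5\ \frac{\text{W}}{\text{m} \cdot \text{K}} \left(-133.333\ \frac{\text{K}}{\text{m}}\right) = 66.667\ \frac{\text{W}}{\text{m}^2} $$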


FIGURE 21.6 
Temperature versus depth into the soil. 
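
Equation (21.21) is straightforward to code. The sketch below wraps it in an anonymous function and applies it to the three depth and temperature values used in the solution above. The function name dlagrange3 and the variable names are hypothetical, introduced only for this illustration.

```matlab
% Eq. (21.21): derivative at x of the parabola through three points.
% xd and fd are 3-element vectors of abscissas and function values.
dlagrange3 = @(x, xd, fd) ...
    fd(1)*(2*x - xd(2) - xd(3))/((xd(1)-xd(2))*(xd(1)-xd(3))) + ...
    fd(2)*(2*x - xd(1) - xd(3))/((xd(2)-xd(1))*(xd(2)-xd(3))) + ...
    fd(3)*(2*x - xd(1) - xd(2))/((xd(3)-xd(1))*(xd(3)-xd(2)));

z = [0 0.0125 0.0375];     % depth (m), unequally spaced
T = [13.5 12 10];          % temperatures used in the example
k = 0.5;                   % thermal conductivity, W/(m K)

dTdz = dlagrange3(0, z, T);    % gradient at the surface, K/m
q    = -k*dTdz;                % heat flux from Fourier's law, W/m^2
fprintf('dT/dz = %.3f K/m, q = %.3f W/m^2\n', dTdz, q);
```

The same function can be evaluated at any depth between the first and last points, which is one of the advantages noted for Eq. (21.21).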
21.5 DERIVATIVES AND INTEGRALS FOR DATA WITH ERRORS


Aside from unequal spacing, another problem related to differentiating empirical data is 
that these data usually include measurement error. A shortcoming of numerical differentia- 
tion is that it tends to amplify errors in the data. 
FIGURE 21.7


Illustration of how small data errors are amplified by numerical differentiation: (a) data with 
no error, (b) the resulting numerical differentiation of curve (a), (c) data modified slightly, and 
(d) the resulting differentiation of curve (c) manifesting increased variability. In contrast, the 
reverse operation of integration [moving from (d) to (c) by taking the area under (d)] tends to 
attenuate or smooth data errors. 


Fig. 21.7a shows smooth, error-free data that when numerically differentiated yield 
a smooth result (Fig. 21.7b). In contrast, Fig. 21.7c uses the same data, but with alternat- 
ing points raised and lowered slightly. This minor modification is barely apparent from 
Fig. 21.7c. However, the resulting effect in Fig. 21.7d is significant. 

The error amplification occurs because differentiation is subtractive. Hence, random 
positive and negative errors tend to add. In contrast, the fact that integration is a summing 
process makes it very forgiving with regard to uncertain data. In essence, as points are 
summed to form an integral, random positive and negative errors cancel out. 

As might be expected, the primary approach for determining derivatives for imprecise 
data is to use least-squares regression to fit a smooth, differentiable function to the data. In 
the absence of any other information, a lower-order polynomial regression might be a good 
first choice. Obviously, if the true functional relationship between the dependent and inde- 
pendent variable is known, this relationship should form the basis for the least-squares fit. 
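
The benefit of fitting before differentiating can be seen with a small experiment. The sketch below perturbs samples of an assumed underlying function (y = x², chosen for illustration) by small alternating errors of ±0.01, then compares direct differencing with the derivative of a least-squares quadratic. The error size and the polynomial order are likewise assumptions.

```matlab
x  = 0:0.1:2;
yt = x.^2;                            % assumed underlying function
e  = 0.01*(-1).^(0:numel(x)-1);       % small alternating errors
y  = yt + e;                          % "measured" data

xm   = (x(1:end-1) + x(2:end))/2;     % interval midpoints
draw = diff(y)./diff(x);              % direct differencing of the noisy data

p    = polyfit(x, y, 2);              % least-squares quadratic fit
dfit = polyval(polyder(p), xm);       % derivative of the fitted polynomial

errRaw = max(abs(draw - 2*xm));       % exact derivative is 2x
errFit = max(abs(dfit - 2*xm));
fprintf('max error: direct = %.4f, regression = %.4f\n', errRaw, errFit);
```

In this experiment the data errors of 0.01 are magnified to derivative errors of about 0.2 by direct differencing, whereas the regression-based derivative is much closer to the true slope.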


21.6 PARTIAL DERIVATIVES


Partial derivatives along a single dimension are computed in the same fashion as ordinary
derivatives. For example, suppose that we want to determine the partial derivatives for a
two-dimensional function f(x, y). For equally spaced data, the partial first derivatives can
be approximated with centered differences:

$$ \frac{\partial f}{\partial x} = \frac{f(x + \Delta x, y) - f(x - \Delta x, y)}{2\Delta x} \tag{21.22} $$

$$ \frac{\partial f}{\partial y} = \frac{f(x, y + \Delta y) - f(x, y - \Delta y)}{2\Delta y} \tag{21.23} $$


All the other formulas and approaches discussed to this point can be applied to evaluate 
partial derivatives in a similar fashion. 

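For higher-order derivatives, we might want to differentiate a function with respect to
two or more different variables. The result is called a mixed partial derivative. For example,
we might want to take the partial derivative of f(x, y) with respect to both independent
variables

$$ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) \tag{21.24} $$

To develop a finite-difference approximation, we can first form a difference in x of the
partial derivatives in y:

$$ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\dfrac{\partial f}{\partial y}(x + \Delta x, y) - \dfrac{\partial f}{\partial y}(x - \Delta x, y)}{2\Delta x} \tag{21.25} $$

Then, we can use finite differences to evaluate each of the partials in y:

$$ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\dfrac{f(x + \Delta x, y + \Delta y) - f(x + \Delta x, y - \Delta y)}{2\Delta y} - \dfrac{f(x - \Delta x, y + \Delta y) - f(x - \Delta x, y - \Delta y)}{2\Delta y}}{2\Delta x} \tag{21.26} $$

Collecting terms yields the final result

$$ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{f(x + \Delta x, y + \Delta y) - f(x + \Delta x, y - \Delta y) - f(x - \Delta x, y + \Delta y) + f(x - \Delta x, y - \Delta y)}{4\Delta x\,\Delta y} \tag{21.27} $$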
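
As a check on Eq. (21.27), the following sketch applies it to a simple test function whose mixed partial derivative is known in closed form. The function f(x, y) = x²y³, the evaluation point, and the increments are assumptions chosen for illustration.

```matlab
f = @(x,y) x.^2.*y.^3;          % assumed test function
x = 1; y = 2;                   % evaluation point
dx = 0.01; dy = 0.01;           % increments in x and y

% Mixed partial derivative by Eq. (21.27)
fxy = (f(x+dx,y+dy) - f(x+dx,y-dy) - f(x-dx,y+dy) + f(x-dx,y-dy))/(4*dx*dy);

exact = 6*x*y^2;                % analytical d2f/dxdy for this test function
fprintf('numerical %.6f, exact %.6f\n', fxy, exact);
```

Only four function evaluations at the corners of a small rectangle around the point are required, and the central value f(x, y) never enters the formula.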


21.7 NUMERICAL DIFFERENTIATION WITH MATLAB 


MATLAB software has the ability to determine the derivatives of data based on two built- 
in functions: diff and gradient. 


21.7.1 MATLAB Function: diff 


When it is passed a one-dimensional vector of length n, the diff function returns a vector 
of length n — 1 containing the differences between adjacent elements. As described in the 
following example, these can then be employed to determine finite-difference approxima- 
tions of first derivatives. 
EXAMPLE 21.4 Using diff for Differentiation


Problem Statement. Explore how the MATLAB diff function can be employed to dif- 
ferentiate the function 


$$ f(x) = 0.2 + 25x - 200x^2 + 675x^3 - 900x^4 + 400x^5 $$

from x = 0 to 0.8. Compare your results with the exact solution:

$$ f'(x) = 25 - 400x + 2025x^2 - 3600x^3 + 2000x^4 $$


Solution. We can first express f (x) as an anonymous function: 
>> f=@(x) 0.2+25*x-200*x.^2+675*x.^3-900*x.^4+400*x.^5;


We can then generate a series of equally spaced values of the independent and dependent 
variables: 


>> x=0:0.1:0.8; 
>> y=f(x); 


The diff function can be used to determine the differences between adjacent elements of 
each vector. For example, 


>> diff (x) 


ans = 
Columns 1 through 5 
0.1000 0.1000 0.1000 0.1000 0.1000 
Columns 6 through 8 
0.1000 0.1000 0.1000 


As expected, the result represents the differences between each pair of elements of x. To 
compute divided-difference approximations of the derivative, we merely perform a vector 
division of the y differences by the x differences by entering 


>> d=diff(y)./diff(x) 


d= 
Columns 1 through 5 
10.8900 -0.0100 3.1900 8.4900 8.6900 
Columns 6 through 8 
1.3900 -11.0100 -21.3100 


Note that because we are using equally spaced values, after generating the x values, we 
could have simply performed the above computation concisely as 


>> d=diff(f(x))/0.1; 


The vector d now contains derivative estimates corresponding to the midpoint between 
adjacent elements. Therefore, in order to develop a plot of our results, we must first gener- 
ate a vector holding the x values for the midpoint of each interval: 


>> n=length(x);
>> xm=(x(1:n-1)+x(2:n))./2;
FIGURE 21.8
Comparison of the exact derivative (line) with numerical estimates (circles) computed with
MATLAB’s diff function.


As a final step, we can compute values for the analytical derivative at a finer level of reso- 
lution to include on the plot for comparison. 


>> xa=0:.01:.8; 
>> ya=25-400*xa+3*675*xa.^2-4*900*xa.^3+5*400*xa.^4;


A plot of the numerical and analytical estimates can be generated with 
>> plot(xm,d, 'o',xa,ya) 


As displayed in Fig. 21.8, the results compare favorably for this case. 


Note that aside from evaluating derivatives, the diff function comes in handy as a 
programming tool for testing certain characteristics of vectors. For example, the following 
statement displays an error message and terminates an M-file if it determines that a vector 
x has unequal spacing: 


if any(diff(diff(x))~=0), error('unequal spacing'), end 


Another common use is to detect whether a vector is in ascending or descending 
order. For example, the following code rejects a vector that is not in ascending order 
(i.e., monotonically increasing): 


if any(diff(x)<=0), error('not in ascending order'), end 
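
One caution when applying such tests to floating-point data: vectors generated with decimal increments (for example, 0:0.1:0.8) may have second differences that are tiny but not exactly zero because of roundoff, so an exact comparison with zero can reject data that are equally spaced for all practical purposes. A tolerance-based variant is sketched below. The helper name isUniform and the tolerance value are assumptions that should be adapted to the data at hand.

```matlab
x   = 0:0.1:0.8;
tol = 1e-10*max(abs(x));       % assumed tolerance, scaled to the data

% True when all second differences are zero to within the tolerance
isUniform = @(v, tol) all(abs(diff(diff(v))) <= tol);

uniformSpacing = isUniform(x, tol);      % equal spacing within tolerance
isAscending    = all(diff(x) > 0);       % strictly increasing

if ~uniformSpacing, error('unequal spacing'), end
if ~isAscending, error('not in ascending order'), end
```

With this version, a vector such as [0 0.1 0.3] is still rejected, while roundoff-level irregularities are ignored.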
21.7.2 MATLAB Function: gradient


The gradient function also returns differences. However, it does so in a manner that is more 
compatible with evaluating derivatives at the values themselves rather than in the intervals 
between values. A simple representation of its syntax is 


fx = gradient(f) 


where f = a one-dimensional vector of length n, and fx is a vector of length n containing 
differences based on f. Just as with the diff function, the first value returned is the differ- 
ence between the first and second value. However, for the intermediate values, a centered 
difference based on the adjacent values is returned 

$$ \text{diff}_i = \frac{f_{i+1} - f_{i-1}}{2} \tag{21.28} $$

The last value is then computed as the difference between the final two values. Hence, the 
results are akin to using centered differences for all the intermediate values, with forward 
and backward differences at the ends. 

Note that the spacing between points is assumed to be one. If the vector represents 
equally spaced data, the following version divides all the results by the interval and hence 
returns the actual values of the derivatives, 


fx = gradient(f, h) 


where h = the spacing between points. 


EXAMPLE 21.5 Using gradient for Differentiation


Problem Statement. Use the gradient function to differentiate the same function that we 
analyzed in Example 21.4 with the diff function. 


Solution. In the same fashion as Example 21.4, we can generate a series of equally spaced 
values of the independent and dependent variables: 

>> f=@(x) 0.2+25*x-200*x.^2+675*x.^3-900*x.^4+400*x.^5;
>> x=0:0.1:0.8;
>> y=f(x);

We can then use the gradient function to determine the derivatives as 

>> dy=gradient(y,0.1) 


dy = 
Columns 1 through 5 
10.8900 5.4400 1.5900 5.8400 8.5900 
Columns 6 through 9 
5.0400 -4.8100 -16.1600 -21.3100 


As in Example 21.4, we can generate values for the analytical derivative and display both 
the numerical and analytical estimates on a plot: 

>> xa=0:.01:.8;
>> ya=25-400*xa+3*675*xa.^2-4*900*xa.^3+5*400*xa.^4;
>> plot(x,dy,'o',xa,ya)
FIGURE 21.9
Comparison of the exact derivative (line) with numerical estimates (circles) computed with 
MATLAB’s gradient function. 


As displayed in Fig. 21.9, the results are not as accurate as those obtained with the diff
function in Example 21.4. This is due to the fact that gradient employs intervals that are
two times (0.2) as wide as those used for diff (0.1).
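
The difference in accuracy can be quantified directly. The sketch below computes the absolute errors of both approaches for the same data, with diff evaluated against the exact derivative at the interval midpoints and gradient against the exact derivative at the data points. The variable names are assumptions introduced here.

```matlab
f  = @(x) 0.2+25*x-200*x.^2+675*x.^3-900*x.^4+400*x.^5;
df = @(x) 25-400*x+3*675*x.^2-4*900*x.^3+5*400*x.^4;    % exact derivative
x  = 0:0.1:0.8;  y = f(x);

xm = (x(1:end-1) + x(2:end))/2;     % midpoints, where diff estimates apply
d  = diff(y)./diff(x);              % estimates from diff
dy = gradient(y, 0.1);              % estimates from gradient

errDiff = abs(d  - df(xm));         % errors at the midpoints
errGrad = abs(dy - df(x));          % errors at the data points

fprintf('max error: diff = %.3f, gradient = %.3f\n', max(errDiff), max(errGrad));
```

The largest gradient errors occur at the two end points, where one-sided differences are used in place of centered differences.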


Beyond one-dimensional vectors, the gradient function is particularly well suited for 
determining the partial derivatives of matrices. For example, for a two-dimensional matrix, 
f, the function can be invoked as 


[ fx, fy] = gradient(f, h) 


where fx corresponds to the differences in the x (column) direction, and fy corresponds 
to the differences in the y (row) direction, and h = the spacing between points. If h is 
omitted, the spacing between points in both dimensions is assumed to be one. In the next 
section, we will illustrate how gradient can be used to visualize vector fields. 
21.8 CASE STUDY VISUALIZING FIELDS


Background. Beyond the determination of derivatives in one dimension, the gradient 
function is also quite useful for determining partial derivatives in two or more dimensions. 
In particular, it can be used in conjunction with other MATLAB functions to produce vi- 
sualizations of vector fields. 

To understand how this is done, we can return to our discussion of partial derivatives 
at the end of Sec. 21.1.1. Recall that we used mountain elevation as an example of a two- 
dimensional function. We can represent such a function mathematically as 


$$ z = f(x, y) $$


where z = elevation, x = distance measured along the east-west axis, and y = distance 
measured along the north-south axis. 

For this example, the partial derivatives provide the slopes in the directions of the axes. 
However, if you were mountain climbing, you would probably be much more interested in 
determining the direction of the maximum slope. If we think of the two partial derivatives 
as component vectors, the answer is provided very neatly by 

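$$ \nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} $$

where $\nabla f$ is referred to as the gradient of f. This vector, which represents the steepest slope,
has a magnitude

$$ \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2} $$

and a direction

$$ \theta = \tan^{-1}\left(\frac{\partial f/\partial y}{\partial f/\partial x}\right) $$

where $\theta$ = the angle measured counterclockwise from the x axis.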

Now suppose that we generate a grid of points in the x-y plane and use the foregoing
equations to draw the gradient vector at each point. The result would be a field of arrows
indicating the steepest route to the peak from any point. Conversely, if we plotted the nega- 
tive of the gradient, it would indicate how a ball would travel as it rolled downhill from 
any point. 

Such graphical representations are so useful that MATLAB has a special function, 
called quiver, to create such plots. A simple representation of its syntax is 


quiver(x,y,u,v)


where x and y are matrices containing the position coordinates and u and v are matrices 
containing the partial derivatives. The following example demonstrates the use of quiver 
to visualize a field. 

Employ the gradient function to determine the partial derivatives for the following
two-dimensional function:

$$ f(x, y) = y - x - 2x^2 - 2xy - y^2 $$

from x = -2 to 0 and y = 1 to 3. Then use quiver to superimpose a vector field on a contour
plot of the function. 


Solution. We can first express f (x, y) as an anonymous function 
>> f=@(x,y) y-x-2*x.^2-2.*x.*y-y.^2;


A series of equally spaced values of the independent and dependent variables can be gener- 
ated as 


>> [x,y]=meshgrid(-2:.25:0, 1:.25:3); 
>> z=f(x,y); 


The gradient function can be employed to determine the partial derivatives: 
>> [fx,fy]=gradient(z,0.25); 

We can then develop a contour plot of the results: 
>> cs=contour(x,y,z);clabel(cs);hold on 


As a final step, the resultant of the partial derivatives can be superimposed as vectors on 
the contour plot: 


>> quiver(x,y,-fx,-fy);hold off 


FIGURE 21.10 
MATLAB generated contour plot of a two-dimensional function with the resultant of the 
partial derivatives displayed as arrows. 
Note that we have displayed the negative of the resultants, in order that they point
“downhill.”


The result is shown in Fig. 21.10. The function’s peak occurs at x = —1 and y = 1.5 
and then drops away in all directions. As indicated by the lengthening arrows, the gradient 
drops off more steeply to the northeast and the southwest. 
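
One way to check the gradient estimates in this example is to compare them with the analytical partial derivatives, ∂f/∂x = −1 − 4x − 2y and ∂f/∂y = 1 − 2x − 2y. The following is a minimal sketch rather than part of the original example; the names fxa, fya, errx, and erry are our own. Because gradient uses centered differences at interior grid points, and centered differences are exact for a quadratic, the interior discrepancies should be at roundoff level.

```matlab
% Sketch: compare gradient's estimates with the analytical partials
f = @(x,y) y - x - 2*x.^2 - 2*x.*y - y.^2;
[x,y] = meshgrid(-2:.25:0, 1:.25:3);
z = f(x,y);
[fx,fy] = gradient(z,0.25);        % numerical partials, grid spacing 0.25

% Analytical partial derivatives of f (derived by hand)
fxa = -1 - 4*x - 2*y;
fya =  1 - 2*x - 2*y;

% Compare at interior points only (edge points use one-sided differences)
in1 = 2:size(z,1)-1;  in2 = 2:size(z,2)-1;
errx = max(max(abs(fx(in1,in2) - fxa(in1,in2))));
erry = max(max(abs(fy(in1,in2) - fya(in1,in2))));
fprintf('max interior error: df/dx %g, df/dy %g\n', errx, erry)
```

Setting both analytical partials to zero gives x = −1 and y = 1.5, which is consistent with the peak location noted above.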


PROBLEMS 


21.1 Compute forward and backward difference approximations of O(h) and $O(h^2)$,
and central difference approximations of $O(h^2)$ and $O(h^4)$ for the first
derivative of y = sin x at x = π/4 using a value of h = π/12. Estimate the true percent
relative error $\varepsilon_t$ for each approximation.

21.2 Use centered difference approximations to estimate the
first and second derivatives of $y = e^x$ at x = 2 for h = 0.1.
Employ both $O(h^2)$ and $O(h^4)$ formulas for your estimates.

21.3 Use a Taylor series expansion to derive a centered
finite-difference approximation to the third derivative that
is second-order accurate. To do this, you will have to use
four different expansions for the points $x_{i-2}$, $x_{i-1}$, $x_{i+1}$, and
$x_{i+2}$. In each case, the expansion will be around the point $x_i$.
The interval Δx will be used in each case of i − 1 and i + 1,
and 2Δx will be used in each case of i − 2 and i + 2. The
four equations must then be combined in a way to eliminate
the first and second derivatives. Carry enough terms along
in each expansion to evaluate the first term that will be truncated
to determine the order of the approximation.

21.4 Use Richardson extrapolation to estimate the first derivative
of y = cos x at x = π/4 using step sizes of $h_1$ = π/3
and $h_2$ = π/6. Employ centered differences of $O(h^2)$ for the
initial estimates.

21.5 Repeat Prob. 21.4, but for the first derivative of ln x at
x = 5 using $h_1$ = 2 and $h_2$ = 1.

21.6 Employ Eq. (21.21) to determine the first derivative
of $y = 2x^4 - 6x^3 - 12x - 8$ at x = 0 based on values at
$x_0$ = −0.5, $x_1$ = 1, and $x_2$ = 2. Compare this result with the
true value and with an estimate obtained using a centered
difference approximation based on h = 1.

21.7 Prove that for equispaced data points, Eq. (21.21)
reduces to Eq. (4.25) at x = $x_1$.

21.8 Develop an M-file to apply a Romberg algorithm to 
estimate the derivative of a given function. 


21.9 Develop an M-file to obtain first-derivative estimates
for unequally spaced data. Test it with the following data:


x      0.6     1.5     1.6     2.5      3.5
f(x)   0.9036  0.3734  0.3261  0.08422  0.01596


where $f(x) = 5e^{-2x}x$. Compare your results with the true
derivatives.

21.10 Develop an M-file function that computes first and
second derivative estimates of order $O(h^2)$ based on the formulas
in Fig. 21.3 through Fig. 21.5. The function’s first line
should be set up as 


function [dydx, d2ydx2] = diffeq(x,y) 


where x and y are input vectors of length n containing the
values of the independent and dependent variables, respectively,
and dydx and d2ydx2 are output vectors of length n
containing the first- and second-derivative estimates at
each value of the independent variable. The function should
generate a plot of dydx and d2ydx2 versus x. Have your
M-file return an error message if (a) the input vectors are not 
the same length or (b) the values for the independent vari- 
able are not equally spaced. Test your program with the data 
from Prob. 21.11. 

21.11 The following data were collected for the distance
traveled versus time for a rocket:


t, s    0   25   50   75   100   125
y, km   0   32   58   78   92    100


Use numerical differentiation to estimate the rocket’s velocity
and acceleration at each time.

21.12 A jet fighter’s position on an aircraft carrier’s runway
was timed during landing:


t, s   0     0.52   1.04   1.75   2.37   3.25   3.83
x, m   153   185    208    249    261    271    273


where x is the distance from the end of the carrier. Estimate
(a) velocity (dx/dt) and (b) acceleration (dv/dt) using numerical
differentiation.

21.13 Use the following data to find the velocity and accel- 
eration at t = 10 seconds: 


Time, t, s       0   2     4     6     8     10    12    14    16
Position, x, m   0   0.7   1.8   3.4   5.1   6.3   7.3   8.0   8.4


Use second-order correct (a) centered finite-difference, 
(b) forward finite-difference, and (c) backward finite- 
difference methods. 

21.14 A plane is being tracked by radar, and data are taken
every second in polar coordinates θ and r.


t, s       200    202    204    206    208    210
θ, (rad)   0.75   0.72   0.70   0.68   0.67   0.66
r, m       5120   5370   5560   5800   6030   6240


At 206 seconds, use the centered finite-difference (second-order
correct) to find the vector expressions for velocity $\vec{v}$ and
acceleration $\vec{a}$. The velocity and acceleration given in polar
coordinates are


$$\vec{v} = \dot{r}\,\vec{e}_r + r\dot{\theta}\,\vec{e}_\theta \quad \text{and} \quad \vec{a} = (\ddot{r} - r\dot{\theta}^2)\,\vec{e}_r + (r\ddot{\theta} + 2\dot{r}\dot{\theta})\,\vec{e}_\theta$$


21.15 Use regression to estimate the acceleration at each 
time for the following data with second-, third-, and fourth- 
order polynomials. Plot the results: 


t   1    2    3.25   4.5   6    7    8    8.5   9.3   10
v   10   12   11     14    17   16   12   14    14    10


21.16 The normal distribution is defined as


$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$


Use MATLAB to determine the inflection points of this
function.

21.17 The following data were generated from the normal
distribution:


x      -2        -1.5      -1        -0.5      0
f(x)   0.05399   0.12952   0.24197   0.35207   0.39894


x      0.5       1         1.5       2
f(x)   0.35207   0.24197   0.12952   0.05399


Use MATLAB to estimate the inflection points of these data.

21.18 Use the diff(y) command to develop a MATLAB
M-file function to compute finite-difference approximations
to the first and second derivative at each x value in the table
below. Use finite-difference approximations that are second-order
correct, $O(x^2)$:


x 0 1 2 3 4 5 6 7 8 9 10 
y 1.4 2.1 3.3 4.8 6.8 6.6 8.6 7.5 8.9 10.9 10 


21.19 The objective of this problem is to compare second-order
accurate forward, backward, and centered finite-difference
approximations of the first derivative of a function
to the actual value of the derivative. This will be done for


$$f(x) = e^{-2x} - x$$


(a) Use calculus to determine the correct value of the derivative
at x = 2.

(b) Develop an M-file function to evaluate the centered
finite-difference approximations, starting with Δx = 0.5.
Thus, for the first evaluation, the x values for the centered
difference approximation will be x = 2 ± 0.5 or
x = 1.5 and 2.5. Then, decrease in increments of 0.1
down to a minimum value of Δx = 0.01.

(c) Repeat part (b) for the second-order forward and back- 
ward differences. (Note that these can be done at the 
same time that the centered difference is computed in 
the loop.) 

(d) Plot the results of (b) and (c) versus x. Include the exact 
result on the plot for comparison. 

21.20 You have to measure the flow rate of water through a
small pipe. In order to do it, you place a bucket at the pipe’s
outlet and measure the volume in the bucket as a function
of time as tabulated below. Estimate the flow rate at t = 7 s.


Time, s       0   1   5   8
Volume, cm³   0   1   8   16


21.21 The velocity v (m/s) of air flowing past a flat surface
is measured at several distances y (m) away from the surface.
Use Newton’s viscosity law to determine the shear stress
τ (N/m²) at the surface (y = 0),


y, m     0   0.002   0.006   0.012   0.018   0.024
u, m/s   0   0.287   0.899   1.915   3.048   4.299

21.22 Fick’s first diffusion law states that

$$\text{Mass flux} = -D\frac{dc}{dx} \qquad \text{(P21.22)}$$

where mass flux = the quantity of mass that passes across
a unit area per unit time (g/cm²/s), D = a diffusion coefficient
(cm²/s), c = concentration (g/cm³), and x = distance
(cm). An environmental engineer measures the following
concentration of a pollutant in the pore waters of sediments
underlying a lake (x = 0 at the sediment-water interface and
increases downward):


x, cm            0      1      3
c, 10⁻⁶ g/cm³    0.06   0.32   0.6


Use the best numerical differentiation technique available
to estimate the derivative at x = 0. Employ this estimate in
conjunction with Eq. (P21.22) to compute the mass flux of
pollutant out of the sediments and into the overlying waters
(D = 1.52 × 10⁻⁶ cm²/s). For a lake with 3.6 × 10⁶ m² of
sediments, how much pollutant would be transported into
the lake over a year’s time?

21.23 The following data were collected when a large oil
tanker was loading:


t, min           0     10    20     30     45     60     75
V, 10⁶ barrels   0.4   0.7   0.77   0.88   1.05   1.17   1.35


Calculate the flow rate Q (i.e., dV/dt) for each time to the
order of $h^2$.

21.24 Fourier’s law is used routinely by architectural engineers
to determine heat flow through walls. The following
temperatures are measured from the surface (x = 0) into a
stone wall:


x, m    0      0.08   0.16
T, °C   20.2   17     15


If the flux at x = 0 is 60 W/m², compute k.


21.25 The horizontal surface area $A_s$ (m²) of a lake at a particular
depth can be computed from volume by differentiation:


$$A_s(z) = -\frac{dV}{dz}(z)$$

where V = volume (m³) and z = depth (m) as measured from
the surface down to the bottom. The average concentration
of a substance that varies with depth, $\bar{c}$ (g/m³), can be computed
by integration:


$$\bar{c} = \frac{\int_0^Z c(z)\,A_s(z)\,dz}{\int_0^Z A_s(z)\,dz}$$


where Z = the total depth (m). Determine the average concentration
based on the following data:


z, m        0        4        8        12       16
V, 10⁶ m³   9.8175   5.1051   1.9635   0.3927   0.0000
c, g/m³     10.2     8.5      7.4      5.2      4.1


21.26 Faraday’s law characterizes the voltage drop across
an inductor as


$$V_L = L\frac{di}{dt}$$

where $V_L$ = voltage drop (V), L = inductance (in henrys;
1 H = 1 V·s/A), i = current (A), and t = time (s). Determine
the voltage drop as a function of time from the following
data for an inductance of 4 H.


[Data table of t versus i: values not recoverable from the source.]


21.27 Based on Faraday’s law (Prob. 21.26), use the following
voltage data to estimate the inductance if a current of 2 A
is passed through the inductor over 400 milliseconds.


t, ms      0   10   20   40   60   80   120   180   280   400
V, volts   0   18   29   44   49   46   35    26    15    7

21.28 The rate of cooling of a body (Fig. P21.28) can be
expressed as


$$\frac{dT}{dt} = -k(T - T_a)$$

where T = temperature of the body (°C), $T_a$ = temperature of
the surrounding medium (°C), and k = a proportionality constant
(per minute). Thus, this equation (called Newton’s law
of cooling) specifies that the rate of cooling is proportional to
the difference in the temperatures of the body and of the surrounding
medium. If a metal ball heated to 80 °C is dropped
into water that is held constant at $T_a$ = 20 °C, the temperature
of the ball changes, as in


Time, min   0    5      10     15     20     25
T, °C       80   44.5   30.0   24.1   21.7   20.7


Utilize numerical differentiation to determine dT/dt at each
value of time. Plot dT/dt versus T − $T_a$ and employ linear
regression to evaluate k.

FIGURE P21.28

21.29 The enthalpy of a real gas is a function of pressure as
described below. The data were taken for a real fluid. Estimate
the enthalpy of the fluid at 400 K and 50 atm (evaluate
the integral from 0.1 to 50 atm).


$$H = \int_0^P \left( V - T\left(\frac{\partial V}{\partial T}\right)_P \right) dP$$

                         V, L
P, atm   T = 350 K   T = 400 K   T = 450 K
0.1      220         250         282.5
5        4.1         4.7         5.23
10       2.2         2.5         2.7
20       1.35        1.49        1.55
25       1.1         1.2         1.24
30       0.90        0.99        1.03
40       0.68        0.75        0.78
45       0.61        0.675       0.7
50       0.54        0.6         0.62


21.30 For fluid flow over a surface, the heat flux to the 
surface can be computed with Fourier’s law: y = distance 
normal to the surface (m). The following measurements are 


made for air flowing over a flat plate where y = distance 
normal to the surface: 


y, cm   0     1     3     5
T, K    900   480   270   210


If the plate’s dimensions are 200 cm long and 50 cm wide,
and k = 0.028 J/(s·m·K), (a) determine the flux at the surface
and (b) the heat transfer in watts. Note that 1 J = 1 W·s.

21.31 The pressure gradient for laminar flow through a constant
radius tube is given by


$$\frac{dp}{dx} = -\frac{8\mu Q}{\pi r^4}$$


where p = pressure (N/m²), x = distance along the tube’s
centerline (m), μ = dynamic viscosity (N·s/m²), Q = flow
(m³/s), and r = radius (m).

(a) Determine the pressure drop for a 10-cm length tube
for a viscous liquid (μ = 0.005 N·s/m², density = ρ =
1 × 10³ kg/m³) with a flow of 10 × 10⁻⁶ m³/s and the
following varying radii along its length:


x, cm   0   2      4      5     6      7      10
r, mm   2   1.35   1.34   1.6   1.58   1.42   2


(b) Compare your result with the pressure drop that would
have occurred if the tube had a constant radius equal to
the average radius.

(c) Determine the average Reynolds number for the tube to
verify that flow is truly laminar (Re = ρvD/μ < 2100
where v = velocity).

21.32 The following data for the specific heat of benzene
were generated with an nth-order polynomial. Use numerical
differentiation to determine n.


T, K                 300      400       500       600
$C_p$, kJ/(kmol·K)   82.888   112.136   136.933   157.744


T, K                 700       800       900       1000
$C_p$, kJ/(kmol·K)   175.036   189.273   200.923   210.450

21.33 The specific heat at constant pressure $c_p$ [J/(kg·K)] of
an ideal gas is related to enthalpy by


$$c_p = \frac{dh}{dT}$$

where h = enthalpy (kJ/kg) and T = absolute temperature
(K). The following enthalpies are provided for carbon
dioxide (CO₂) at several temperatures. Use these values to
determine the specific heat in J/(kg·K) for each of the tabulated
temperatures. Note that the atomic weights of carbon
and oxygen are 12.011 and 15.9994 g/mol, respectively.


T, K         750      800      900      1000
h, kJ/kmol   29,629   32,179   37,405   42,769

21.34 An nth-order rate law is often used to model chemical
reactions that solely depend on the concentration of a single
reactant:


$$\frac{dc}{dt} = -kc^n$$

where c = concentration (mole), t = time (min), n =
reaction order (dimensionless), and k = reaction rate
($\text{min}^{-1}\,\text{mole}^{1-n}$). The differential method can be used to
evaluate the parameters k and n. This involves applying a
logarithmic transform to the rate law to yield,


$$\log\left(-\frac{dc}{dt}\right) = \log k + n \log c$$

Therefore, if the nth-order rate law holds, a plot of the
log(−dc/dt) versus log c should yield a straight line with
a slope of n and an intercept of log k. Use the differential
method and linear regression to determine k and n for the
following data for the conversion of ammonium cyanate to
urea:


t, min    0       5       15      30      45
c, mole   0.750   0.594   0.420   0.291   0.223

21.35 The sediment oxygen demand [SOD in units of
g/(m²·d)] is an important parameter in determining the
dissolved oxygen content of a natural water. It is measured
by placing a sediment core in a cylindrical container
(Fig. P21.35). After carefully introducing a layer of distilled,
oxygenated water above the sediments, the container is covered
to prevent gas transfer. A stirrer is used to mix the water
gently, and an oxygen probe tracks how the water’s oxygen
concentration decreases over time. The SOD can then be
computed as


$$\text{SOD} = -H\frac{do}{dt}$$


where H = the depth of water (m), o = oxygen concentration
(g/m³), and t = time (d).


FIGURE P21.35


Based on the following data and H = 0.1 m, use numerical
differentiation to generate plots of (a) SOD versus
time and (b) SOD versus oxygen concentration:


t, d      0    0.125   0.25   0.375   0.5    0.625   0.75
o, mg/L   10   7.11    4.59   2.57    1.15   0.33    0.03

21.36 The following relationships can be used to analyze
uniform beams subject to distributed loads:


$$\frac{dy}{dx} = \theta(x) \qquad \frac{d\theta}{dx} = \frac{M(x)}{EI} \qquad \frac{dM}{dx} = V(x) \qquad \frac{dV}{dx} = -w(x)$$

where x = distance along beam (m), y = deflection (m),
θ(x) = slope (m/m), E = modulus of elasticity (Pa = N/m²),
I = moment of inertia (m⁴), M(x) = moment (N m), V(x) =
shear (N), and w(x) = distributed load (N/m). For the case of
a linearly increasing load (recall Fig. P5.13), the slope can
be computed analytically as


$$\theta(x) = \frac{w_0}{120EIL}\left(-5x^4 + 6L^2x^2 - L^4\right) \qquad \text{(P21.36)}$$

Employ (a) numerical integration to compute the deflection
(in m) and (b) numerical differentiation to compute
the moment (in N m) and shear (in N). Base your numerical
calculations on values of the slope computed with
Eq. (P21.36) at equally spaced intervals of Δx = 0.125 m
along a 3-m beam. Use the following parameter values in
your computation: E = 200 GPa, I = 0.0003 m⁴, and $w_0$ =
2.5 kN/cm. In addition, the deflections at the ends of the
beam are set at y(0) = y(L) = 0. Be careful of units.

21.37 You measure the following deflections along the
length of a simply supported uniform beam (see Prob. 21.36)

x, m    0   0.375     0.75      1.125     1.5
y, cm   0   -0.2571   -0.9484   -1.9689   -3.2262

x, m    1.875     2.25      2.625     3
y, cm   -4.6414   -6.1503   -7.7051   -9.275


Employ numerical differentiation to compute the slope, the
moment (in N m), the shear (in N), and the distributed load
(in N/m). Use the following parameter values in your computation:
E = 200 GPa and I = 0.0003 m⁴.

21.38 Evaluate ∂f/∂x, ∂f/∂y, and ∂²f/(∂x∂y) for the following
function at x = y = 1 (a) analytically and (b) numerically
with Δx = Δy = 0.0001:


$$f(x, y) = 3xy + 3x - x^3 - 3y^3$$


21.39 Develop a script to generate the same computations
and plots as in Sec. 21.8, but for the following functions
(for x = −3 to 3 and y = −3 to 3): (a) $f(x, y) = e^{-(x^2+y^2)}$ and
(b) $f(x, y) = xe^{-(x^2+y^2)}$.

21.40 Develop a script to generate the same computa- 
tions and plots as in Sec. 21.8, but for the MATLAB peaks 
function over ranges of both x and y from −3 to 3.

21.41 The velocity (m/s) of an object at time t seconds is
given by


Using Richardson’s extrapolation, find the acceleration of 
the particle at time t = 5 s using h = 0.5 and 0.25. Employ 
the exact solution to compute the true percent relative error 
of each estimate. 


6.1 


Ordinary Differential 
Equations 


OVERVIEW 


The fundamental laws of physics, mechanics, electricity, and thermodynamics are usually 
based on empirical observations that explain variations in physical properties and states 
of systems. Rather than describing the state of physical systems directly, the laws are usu- 
ally couched in terms of spatial and temporal changes. These laws define mechanisms 
of change. When combined with continuity laws for energy, mass, or momentum, dif- 
ferential equations result. Subsequent integration of these differential equations results in 
mathematical functions that describe the spatial and temporal state of a system in terms of 
energy, mass, or velocity variations. As in Fig. PT6.1, the integration can be implemented 
analytically with calculus or numerically with the computer. 

The free-falling bungee jumper problem introduced in Chap. 1 is an example of the 
derivation of a differential equation from a fundamental law. Recall that Newton’s second 
law was used to develop an ODE describing the rate of change of velocity of a falling
bungee jumper:


$$\frac{dv}{dt} = g - \frac{c_d}{m}v^2 \qquad \text{(PT6.1)}$$
where g is the gravitational constant, m
is the mass, and $c_d$ is a drag coefficient.
Such equations, which are composed of 
an unknown function and its derivatives, 
are called differential equations. They are 
sometimes referred to as rate equations 
because they express the rate of change of 
a variable as a function of variables and 
parameters. 

In Eq. (PT6.1), the quantity being differentiated v is called the dependent
variable. The quantity with respect to which v is differentiated t is called the
independent variable. When the function involves one independent variable, the
equation is called an ordinary differential equation (or ODE). This is in contrast
to a partial differential equation (or PDE) that involves two or more independent variables.

FIGURE PT6.1
The sequence of events in the development and solution of ODEs for engineering and
science. The example shown is for the velocity of the free-falling bungee jumper.
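
The two solution routes for Eq. (PT6.1), analytical and numerical, can be compared with a short script. This is only a sketch: the parameter values (g, m, cd), the step size, and the time span are assumed for illustration and are not specified in this passage. The analytical expression is the closed-form solution for a jumper starting from rest, and the numerical route applies the update new value = old value + slope × step size.

```matlab
% Assumed illustrative parameters (not given in this passage)
g = 9.81;  m = 68.1;  cd = 0.25;     % m/s^2, kg, kg/m
dt = 0.5;  t = 0:dt:12;              % step size and time grid (s)

% Analytical route (calculus): closed-form velocity starting from rest
va = sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t);

% Numerical route (computer): v(i+1) = v(i) + (g - cd/m*v(i)^2)*dt
vn = zeros(size(t));                 % initial condition v(0) = 0
for i = 1:length(t)-1
  vn(i+1) = vn(i) + (g - cd/m*vn(i)^2)*dt;
end

disp('      t     analytical  numerical')
disp([t' va' vn'])
```

Reducing dt should bring the numerical column closer to the analytical one, at the cost of more steps.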

Differential equations are also classified as to their order. For example, Eq. (PT6.1) is 
called a first-order equation because the highest derivative is a first derivative. A second- 
order equation would include a second derivative. For example, the equation describing the 
position x of an unforced mass-spring system with damping is the second-order equation: 


$$m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = 0 \qquad \text{(PT6.2)}$$
where m is mass, c is a damping coefficient, and k is a spring constant. Similarly, an nth- 
order equation would include an nth derivative. 

Higher-order differential equations can be reduced to a system of first-order equations. 
This is accomplished by defining the first derivative of the dependent variable as a new 
variable. For Eq. (PT6.2), this is done by creating a new variable v as the first derivative 
of displacement 


$$v = \frac{dx}{dt} \qquad \text{(PT6.3)}$$
where v is velocity. This equation can itself be differentiated to yield

$$\frac{dv}{dt} = \frac{d^2x}{dt^2} \qquad \text{(PT6.4)}$$
Equations (PT6.3) and (PT6.4) can be substituted into Eq. (PT6.2) to convert it into a first- 
order equation: 


$$m\frac{dv}{dt} + cv + kx = 0 \qquad \text{(PT6.5)}$$
As a final step, we can express Eqs. (PT6.3) and (PT6.5) as rate equations:


$$\frac{dx}{dt} = v \qquad \text{(PT6.6)}$$
$$\frac{dv}{dt} = -\frac{c}{m}v - \frac{k}{m}x \qquad \text{(PT6.7)}$$


Thus, Eqs. (PT6.6) and (PT6.7) are a pair of first-order equations that are equivalent 
to the original second-order equation [Eq. (PT6.2)]. Because other nth-order differential 
equations can be similarly reduced, this part of our book focuses on the solution of first- 
order equations. 
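
The pair of rate equations can be coded as a single function that returns both derivatives for a state vector [x; v]. The following is a minimal sketch, assuming illustrative values for m, c, and k and an initial displacement of 1 with zero velocity; none of these values come from the text. It uses MATLAB's built-in ode45 solver simply as a convenient way to integrate the system.

```matlab
% Assumed illustrative parameters and initial conditions
m = 1;  c = 0.5;  k = 4;             % mass, damping coefficient, spring constant

% State vector y = [x; v]; rates follow Eqs. (PT6.6) and (PT6.7)
dydt = @(t,y) [y(2); -(c/m)*y(2) - (k/m)*y(1)];

% Integrate from t = 0 to 10 with x(0) = 1 and v(0) = 0
[t,y] = ode45(dydt,[0 10],[1; 0]);

fprintf('x(10) = %.4f, v(10) = %.4f\n', y(end,1), y(end,2))
```

The first column of y holds the position x and the second holds the velocity v, so the original second-order problem is recovered from the solution of two first-order equations.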

A solution of an ordinary differential equation is a specific function of the independent 
variable and parameters that satisfies the original differential equation. To illustrate this 
concept, let us start with a simple fourth-order polynomial, 


$$y = -0.5x^4 + 4x^3 - 10x^2 + 8.5x + 1 \qquad \text{(PT6.8)}$$
Now, if we differentiate Eq. (PT6.8), we obtain an ODE:


$$\frac{dy}{dx} = -2x^3 + 12x^2 - 20x + 8.5 \qquad \text{(PT6.9)}$$
This equation also describes the behavior of the polynomial, but in a manner different 
from Eq. (PT6.8). Rather than explicitly representing the values of y for each value of x, 
Eq. (PT6.9) gives the rate of change of y with respect to x (i.e., the slope) at every value 
of x. Figure PT6.2 shows both the function and the derivative plotted versus x. Notice how 
the zero values of the derivatives correspond to the point at which the original function is 
flat—that is, where it has a zero slope. Also, the maximum absolute values of the deriva- 
tives are at the ends of the interval where the slopes of the function are greatest. 
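
The differentiation step can be reproduced with MATLAB's polynomial functions. In this sketch (variable names are our own), polyder returns the coefficients of the derivative, which should match Eq. (PT6.9), and roots locates the x values where the slope is zero.

```matlab
p  = [-0.5 4 -10 8.5 1];     % coefficients of Eq. (PT6.8), highest power first
dp = polyder(p)              % derivative coefficients: -2  12  -20  8.5
xflat = roots(dp)            % x values where dy/dx = 0 (function is flat)
slopes = polyval(dp,xflat);  % slope at those points, should be ~0
```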

Although, as just demonstrated, we can determine a differential equation given the 
original function, the object here is to determine the original function given the differential 
equation. The original function then represents the solution. 

Without computers, ODEs are usually solved analytically with calculus. For example, 
Eq. (PT6.9) could be multiplied by dx and integrated to yield 


$$y = \int \left(-2x^3 + 12x^2 - 20x + 8.5\right) dx \qquad \text{(PT6.10)}$$


The right-hand side of this equation is called an indefinite integral because the limits of 
integration are unspecified. This is in contrast to the definite integrals discussed previously 
in Part Five [compare Eq. (PT6.10) with Eq. (19.5)]. 

An analytical solution for Eq. (PT6.10) is obtained if the indefinite integral can be eval- 
uated exactly in equation form. For this simple case, it is possible to do this with the result: 


$$y = -0.5x^4 + 4x^3 - 10x^2 + 8.5x + C \qquad \text{(PT6.11)}$$


which is identical to the original function with one notable exception. In the course of
differentiating and then integrating, we lost the constant value of 1 in the original equation and
gained the value C. This C is called a constant of integration. The fact that such an arbitrary
constant appears indicates that the solution is not unique. In fact, it is but one of an infinite
number of possible functions (corresponding to an infinite number of possible values of C)
that satisfy the differential equation. For example, Fig. PT6.3 shows six possible functions
that satisfy Eq. (PT6.11).

FIGURE PT6.2
Plots of (a) y versus x and (b) dy/dx versus x for the function $y = -0.5x^4 + 4x^3 - 10x^2 + 8.5x + 1$.

Therefore, to specify the solution completely, a differential equation is usually ac- 
companied by auxiliary conditions. For first-order ODEs, a type of auxiliary condition 
called an initial value is required to determine the constant and obtain a unique solution. 
For example, the original differential equation could be accompanied by the initial condi- 
tion that at x = 0, y = 1. These values could be substituted into Eq. (PT6.11) to determine 
C = 1. Therefore, the unique solution that satisfies both the differential equation and the 
specified initial condition is 


$$y = -0.5x^4 + 4x^3 - 10x^2 + 8.5x + 1$$


Thus, we have “pinned down” Eq. (PT6.11) by forcing it to pass through the initial condi- 
tion, and in so doing, we have developed a unique solution to the ODE and have come full 
circle to the original function [Eq. (PT6.8)]. 
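
The same "pinning down" can be sketched in MATLAB. Here polyint returns an antiderivative whose constant term is zero, and the initial condition y = 1 at x = 0 is then used to fix C. The variable names x0, y0, and C are our own.

```matlab
dp = [-2 12 -20 8.5];        % coefficients of Eq. (PT6.9)
p  = polyint(dp);            % antiderivative with constant of integration 0
x0 = 0;  y0 = 1;             % initial condition: y = 1 at x = 0
C  = y0 - polyval(p,x0);     % constant needed to pass through (x0, y0)
p(end) = p(end) + C          % coefficients of the unique solution
```

Choosing a different initial condition would change only the last coefficient, which corresponds to selecting a different member of the family of curves that satisfy the ODE.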

Initial conditions usually have very tangible interpretations for differential equations
derived from physical problem settings. For example, in the bungee jumper problem, the
initial condition was reflective of the physical fact that at time zero the vertical velocity was
zero. If the bungee jumper had already been in vertical motion at time zero, the solution
would have been modified to account for this initial velocity.

FIGURE PT6.3
Six possible solutions for the integral of $-2x^3 + 12x^2 - 20x + 8.5$. Each conforms to a different
value of the constant of integration C.

When dealing with an nth-order differential equation, n conditions are required to ob- 
tain a unique solution. If all conditions are specified at the same value of the independent 
variable (e.g., at x or t = 0), then the problem is called an initial-value problem. This is in 
contrast to boundary-value problems where specification of conditions occurs at different 
values of the independent variable. Chapters 22 and 23 will focus on initial-value prob- 
lems. Boundary-value problems are covered in Chap. 24. 


PART ORGANIZATION 


Chapter 22 is devoted to one-step methods for solving initial-value ODEs. As the name
suggests, one-step methods compute a future prediction $y_{i+1}$, based only on information
at a single point $y_i$ and no other previous information. This is in contrast to multistep
approaches that use information from several previous points as the basis for extrapolating
to a new value.

With all but a minor exception, the one-step methods presented in Chap. 22 belong to 
what are called Runge-Kutta techniques. Although the chapter might have been organized 
around this theoretical notion, we have opted for a more graphical, intuitive approach to 
introduce the methods. Thus, we begin the chapter with Euler’s method, which has a very 
straightforward graphical interpretation. In addition, because we have already introduced 
Euler’s method in Chap. 1, our emphasis here is on quantifying its truncation error and 
describing its stability. 
Next, we use visually oriented arguments to develop two improved versions of Euler’s
method—the Heun and the midpoint techniques. After this introduction, we formally de- 
velop the concept of Runge-Kutta (or RK) approaches and demonstrate how the forego- 
ing techniques are actually first- and second-order RK methods. This is followed by a 
discussion of the higher-order RK formulations that are frequently used for engineering 
and scientific problem solving. In addition, we cover the application of one-step methods 
to systems of ODEs. Note that all the applications in Chap. 22 are limited to cases with a 
fixed step size. 

In Chap. 23, we cover more advanced approaches for solving initial-value problems. 
First, we describe adaptive RK methods that automatically adjust the step size in response 
to the truncation error of the computation. These methods are especially pertinent as they 
are employed by MATLAB to solve ODEs. 

Next, we discuss multistep methods. As mentioned above, these algorithms retain in- 
formation of previous steps to more effectively capture the trajectory of the solution. They 
also yield the truncation error estimates that can be used to implement step-size control. We 
describe a simple method—the non-self-starting Heun method—to introduce the essential 
features of the multistep approaches. 

Finally, the chapter ends with a description of stiff ODEs. These are both individual 
and systems of ODEs that have both fast and slow components to their solution. As a con- 
sequence, they require special solution approaches. We introduce the idea of an implicit 
solution technique as one commonly used remedy. We also describe MATLAB’s built-in 
functions for solving stiff ODEs. 

In Chap. 24, we focus on two approaches for obtaining solutions to boundary-value 
problems: the shooting and finite-difference methods. Aside from demonstrating how these 
techniques are implemented, we illustrate how they handle derivative boundary conditions 
and nonlinear ODEs. 


Initial-Value Problems 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to solving initial-value 
problems for ODEs (ordinary differential equations). Specific objectives and topics 
covered are 


• Understanding the meaning of local and global truncation errors and their
  relationship to step size for one-step methods for solving ODEs.
• Knowing how to implement the following Runge-Kutta (RK) methods for
  a single ODE:
    Euler
    Heun
    Midpoint
    Fourth-order RK
• Knowing how to iterate the corrector of Heun’s method.
• Knowing how to implement the following Runge-Kutta methods for systems
  of ODEs:
    Euler
    Fourth-order RK

YOU’VE GOT A PROBLEM


We started this book with the problem of simulating the velocity of a free-falling
bungee jumper. This problem amounted to formulating and solving an ordinary
differential equation, the topic of this chapter. Now let’s return to this problem
and make it more interesting by computing what happens when the jumper reaches the end
of the bungee cord.

To do this, we should recognize that the jumper will experience different forces
depending on whether the cord is slack or stretched. If it is slack, the situation is that of free
fall where the only forces are gravity and drag. However, because the jumper can now move
up as well as down, the sign of the drag force must be modified so that it always tends to
retard velocity,


$$\frac{dv}{dt} = g - \text{sign}(v)\frac{c_d}{m}v^2 \qquad \text{(22.1a)}$$
where v is velocity (m/s), t is time (s), g is the acceleration due to gravity (9.81 m/s²), $c_d$ is
the drag coefficient (kg/m), and m is mass (kg). The signum function,¹ sign, returns a −1
or a 1 depending on whether its argument is negative or positive, respectively. Thus, when
the jumper is falling downward (positive velocity, sign = 1), the drag force will be negative
and hence will act to reduce velocity. In contrast, when the jumper is moving upward
(negative velocity, sign = −1), the drag force will be positive so that it again reduces the
velocity.

Once the cord begins to stretch, it obviously exerts an upward force on the jumper. As 
done previously in Chap. 8, Hooke’s law can be used as a first approximation of this force. 
In addition, a dampening force should also be included to account for frictional effects as 
the cord stretches and contracts. These factors can be incorporated along with gravity and 
drag into a second force balance that applies when the cord is stretched. The result is the 
following differential equation: 


$$\frac{dv}{dt} = g - \text{sign}(v)\frac{c_d}{m}v^2 - \frac{k}{m}(x - L) - \frac{\gamma}{m}v \qquad \text{(22.1b)}$$
where k is the cord’s spring constant (N/m), x is vertical distance measured downward
from the bungee jump platform (m), L is the length of the unstretched cord (m), and γ is a
dampening coefficient (N·s/m).

Because Eq. (22.1b) only holds when the cord is stretched (x > L), the spring force will
always be negative. That is, it will always act to pull the jumper back up. The dampening 
force increases in magnitude as the jumper’s velocity increases and always acts to slow the 
jumper down. 
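The two force balances can be collected into a single expression that switches form according to the jumper’s position. The following is a minimal sketch, not a definitive implementation: the handle names cord and dvdt and all of the parameter values are assumptions made for illustration.

```matlab
% Illustrative parameter values (assumed for this sketch)
g = 9.81;      % acceleration due to gravity (m/s^2)
cd = 0.25;     % drag coefficient (kg/m)
m = 68.1;      % mass (kg)
k = 40;        % cord spring constant (N/m)
gamma = 8;     % dampening coefficient (N s/m)
L = 30;        % length of the unstretched cord (m)

% Cord terms of Eq. (22.1b); they vanish while the cord is slack (x <= L)
cord = @(x,v) (x > L)*(k/m*(x - L) + gamma/m*v);

% Acceleration: Eq. (22.1a) when slack, Eq. (22.1b) when stretched
dvdt = @(x,v) g - sign(v)*cd/m*v^2 - cord(x,v);

a_slack = dvdt(10, 5);       % falling with the cord slack
a_stretched = dvdt(40, 5);   % falling with the cord stretched
a_rising = dvdt(10, -5);     % rising with the cord slack
```

When the jumper rises with a slack cord, drag acts in the same direction as gravity, so a_rising exceeds g. Once the cord is stretched, the spring and dampening terms reduce the downward acceleration.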

If we want to simulate the jumper’s velocity, we would initially solve Eq. (22.1a) 
until the cord was fully extended. Then, we could switch to Eq. (22.1b) for periods that the 
cord is stretched. Although this is fairly straightforward, it means that knowledge of the 
jumper’s position is required. This can be done by formulating another differential equation
for distance:

$$\frac{dx}{dt} = v \tag{22.2}$$


Thus, solving for the bungee jumper’s velocity amounts to solving two ordinary differential
equations where one of the equations takes different forms depending on the value
of one of the dependent variables. Chapters 22 and 23 explore methods for solving this and
similar problems involving ODEs.

¹ Some computer languages represent the signum function as sgn(x). As represented here, MATLAB uses the
nomenclature sign(x).


22.1 OVERVIEW


This chapter is devoted to solving ordinary differential equations of the form 


$$\frac{dy}{dt} = f(t, y) \tag{22.3}$$

In Chap. 1, we developed a numerical method to solve such an equation for the velocity of
the free-falling bungee jumper. Recall that the method was of the general form

New value = old value + slope × step size

or, in mathematical terms,

$$y_{i+1} = y_i + \phi h \tag{22.4}$$

where the slope $\phi$ is called an increment function. According to this equation, the slope
estimate of $\phi$ is used to extrapolate from an old value $y_i$ to a new value $y_{i+1}$ over a distance
h. This formula can be applied step by step to trace out the trajectory of the solution into
the future. Such approaches are called one-step methods because the value of the increment
function is based on information at a single point i. They are also referred to as Runge-Kutta
methods after the two applied mathematicians who first discussed them in the early
1900s. Another class of methods called multistep methods use information from several
previous points as the basis for extrapolating to a new value. We will describe multistep
methods briefly in Chap. 23.

All one-step methods can be expressed in the general form of Eq. (22.4), with the only
difference being the manner in which the slope is estimated. The simplest approach is to
use the differential equation to estimate the slope in the form of the first derivative at $t_i$. In
other words, the slope at the beginning of the interval is taken as an approximation of the
average slope over the whole interval. This approach, called Euler’s method, is discussed
next. This is followed by other one-step methods that employ alternative slope estimates
that result in more accurate predictions.


22.2 EULER’S METHOD

The first derivative provides a direct estimate of the slope at $t_i$ (Fig. 22.1):

$$\phi = f(t_i, y_i)$$

where $f(t_i, y_i)$ is the differential equation evaluated at $t_i$ and $y_i$. This estimate can be
substituted into Eq. (22.4):

$$y_{i+1} = y_i + f(t_i, y_i)h \tag{22.5}$$

This formula is referred to as Euler’s method (or the Euler-Cauchy or point-slope method).
A new value of y is predicted using the slope (equal to the first derivative at the original
value of t) to extrapolate linearly over the step size h (Fig. 22.1).
FIGURE 22.1
Euler’s method. [Figure omitted: the plot labels are Predicted, True, and error.]


EXAMPLE 22.1 Euler’s Method


Problem Statement. Use Euler’s method to integrate $y' = 4e^{0.8t} - 0.5y$ from t = 0 to 4
with a step size of 1. The initial condition at t = 0 is y = 2. Note that the exact solution can
be determined analytically as

$$y = \frac{4}{1.3}\left(e^{0.8t} - e^{-0.5t}\right) + 2e^{-0.5t}$$

Solution. Equation (22.5) can be used to implement Euler’s method:

$$y(1) = y(0) + f(0, 2)(1)$$

where y(0) = 2 and the slope estimate at t = 0 is

$$f(0, 2) = 4e^{0} - 0.5(2) = 3$$

Therefore,

$$y(1) = 2 + 3(1) = 5$$

The true solution at t = 1 is

$$y = \frac{4}{1.3}\left(e^{0.8(1)} - e^{-0.5(1)}\right) + 2e^{-0.5(1)} = 6.19463$$

Thus, the percent relative error is

$$\varepsilon_t = \left|\frac{6.19463 - 5}{6.19463}\right| \times 100\% = 19.28\%$$

For the second step:

$$y(2) = y(1) + f(1, 5)(1) = 5 + \left[4e^{0.8(1)} - 0.5(5)\right](1) = 11.40216$$
TABLE 22.1 Comparison of true and numerical values of the integral
of $y' = 4e^{0.8t} - 0.5y$, with the initial condition that y = 2 at
t = 0. The numerical values were computed using Euler’s
method with a step size of 1.

t    y_true      y_Euler     |ε_t| (%)
0     2.00000     2.00000
1     6.19463     5.00000    19.28
2    14.84392    11.40216    23.19
3    33.67717    25.51321    24.24
4    75.33896    56.84931    24.54


FIGURE 22.2
Comparison of the true solution with a numerical solution using Euler’s method for the integral
of $y' = 4e^{0.8t} - 0.5y$ from t = 0 to 4 with a step size of 1.0. The initial condition at t = 0 is y = 2.


The true solution at t = 2.0 is 14.84392 and, therefore, the true percent relative error is 
23.19%. The computation is repeated, and the results compiled in Table 22.1 and Fig. 22.2. 
Note that although the computation captures the general trend of the true solution, the 
error is considerable. As discussed in the next section, this error can be reduced by using 
a smaller step size. 
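
The hand calculation above can be checked with a few lines of MATLAB. This is a minimal sketch that applies Eq. (22.5) in a loop and compares the result with the analytical solution. The variable names (f, ytrue, et) are chosen here for illustration.

```matlab
f = @(t,y) 4*exp(0.8*t) - 0.5*y;                                 % right-hand side of the ODE
ytrue = @(t) 4/1.3*(exp(0.8*t) - exp(-0.5*t)) + 2*exp(-0.5*t);   % analytical solution

h = 1;                          % step size
t = (0:h:4)';                   % values of the independent variable
y = zeros(size(t)); y(1) = 2;   % initial condition y(0) = 2

for i = 1:length(t)-1
  y(i+1) = y(i) + f(t(i),y(i))*h;   % Euler's method, Eq. (22.5)
end

et = abs((ytrue(t) - y)./ytrue(t))*100;   % true percent relative error
disp([t ytrue(t) y et])
```

The displayed columns should agree with Table 22.1 to the number of digits shown there.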


22.2.1 Error Analysis for Euler’s Method 


The numerical solution of ODEs involves two types of error (recall Chap. 4): 


1. Truncation, or discretization, errors caused by the nature of the techniques employed 
to approximate values of y. 

2. Roundoff errors caused by the limited numbers of significant digits that can be retained 
by a computer. 


The truncation errors are composed of two parts. The first is a local truncation error
that results from an application of the method in question over a single step. The second
is a propagated truncation error that results from the approximations produced during the
previous steps. The sum of the two is the total error. It is referred to as the global truncation
error.

Insight into the magnitude and properties of the truncation error can be gained by
deriving Euler’s method directly from the Taylor series expansion. To do this, realize that
the differential equation being integrated will be of the general form of Eq. (22.3), where
dy/dt = y′, and t and y are the independent and the dependent variables, respectively. If the
solution (that is, the function describing the behavior of y) has continuous derivatives,
it can be represented by a Taylor series expansion about a starting value $(t_i, y_i)$, as in
[recall Eq. (4.13)]:

$$y_{i+1} = y_i + y_i' h + \frac{y_i''}{2!} h^2 + \cdots + \frac{y_i^{(n)}}{n!} h^n + R_n \tag{22.6}$$

where $h = t_{i+1} - t_i$ and $R_n$ = the remainder term, defined as

$$R_n = \frac{y^{(n+1)}(\xi)}{(n+1)!}\, h^{n+1} \tag{22.7}$$

where $\xi$ lies somewhere in the interval from $t_i$ to $t_{i+1}$. An alternative form can be developed
by substituting Eq. (22.3) into Eqs. (22.6) and (22.7) to yield

$$y_{i+1} = y_i + f(t_i, y_i)h + \frac{f'(t_i, y_i)}{2!} h^2 + \cdots + \frac{f^{(n-1)}(t_i, y_i)}{n!} h^n + O(h^{n+1}) \tag{22.8}$$

where $O(h^{n+1})$ specifies that the local truncation error is proportional to the step size raised
to the (n + 1)th power.

By comparing Eqs. (22.5) and (22.8), it can be seen that Euler’s method corresponds to
the Taylor series up to and including the term $f(t_i, y_i)h$. Additionally, the comparison
indicates that a truncation error occurs because we approximate the true solution using a finite
number of terms from the Taylor series. We thus truncate, or leave out, a part of the true
solution. For example, the truncation error in Euler’s method is attributable to the remaining
terms in the Taylor series expansion that were not included in Eq. (22.5). Subtracting
Eq. (22.5) from Eq. (22.8) yields

$$E_t = \frac{f'(t_i, y_i)}{2!} h^2 + \cdots + O(h^{n+1}) \tag{22.9}$$

where $E_t$ = the true local truncation error. For sufficiently small h, the higher-order terms
in Eq. (22.9) are usually negligible, and the result is often represented as

$$E_a = \frac{f'(t_i, y_i)}{2!} h^2 \tag{22.10}$$

or

$$E_a = O(h^2) \tag{22.11}$$

where $E_a$ = the approximate local truncation error.

According to Eq. (22.11), we see that the local error is proportional to the square of
the step size and the first derivative of the differential equation. It can also be demonstrated
that the global truncation error is O(h); that is, it is proportional to the step size
(Carnahan et al., 1969). These observations lead to some useful conclusions:


1. The global error can be reduced by decreasing the step size. 

2. The method will provide error-free predictions if the underlying function (i.e., the 
solution of the differential equation) is linear, because for a straight line the second 
derivative would be zero. 


This latter conclusion makes intuitive sense because Euler’s method uses straight-line
segments to approximate the solution. Hence, Euler’s method is referred to as a first-order
method.

It should also be noted that this general pattern holds for the higher-order one-step
methods described in the following pages. That is, an nth-order method will yield perfect
results if the underlying solution is an nth-order polynomial. Further, the local truncation
error will be $O(h^{n+1})$ and the global error $O(h^n)$.
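
The claim that the global error of Euler’s method is O(h) can be checked numerically. The sketch below reuses the ODE from Example 22.1 and repeatedly halves the step size. The step counts in nsteps are arbitrary choices for illustration. If the global error is proportional to h, each halving should roughly halve the error at t = 4, so the entries of ratio should be close to 2.

```matlab
f = @(t,y) 4*exp(0.8*t) - 0.5*y;
ytrue = @(t) 4/1.3*(exp(0.8*t) - exp(-0.5*t)) + 2*exp(-0.5*t);

nsteps = [40 80 160 320];       % number of steps used to span t = 0 to 4
err = zeros(size(nsteps));
for j = 1:length(nsteps)
  n = nsteps(j); h = 4/n;
  y = 2;                        % initial condition
  for i = 1:n
    t = (i-1)*h;                % time at the start of step i
    y = y + f(t,y)*h;           % Euler's method
  end
  err(j) = abs(ytrue(4) - y);   % global error at t = 4
end
ratio = err(1:end-1)./err(2:end);   % error reduction for each halving of h
disp(ratio)
```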


22.2.2 Stability of Euler’s Method 


In the preceding section, we learned that the truncation error of Euler’s method depends 
on the step size in a predictable way based on the Taylor series. This is an accuracy issue. 

The stability of a solution method is another important consideration when solving
ODEs. A numerical solution is said to be unstable if errors grow exponentially for a
problem for which there is a bounded solution. The stability of a particular
application can depend on three factors: the differential equation, the numerical
method, and the step size.

Insight into the step size required for stability can be gained by studying a very
simple ODE:

$$\frac{dy}{dt} = -ay \tag{22.12}$$

If $y(0) = y_0$, calculus can be used to determine the solution as

$$y = y_0 e^{-at}$$

Thus, the solution starts at $y_0$ and asymptotically approaches zero.

Now suppose that we use Euler’s method to solve the same problem numerically:

$$y_{i+1} = y_i + \frac{dy_i}{dt} h$$

Substituting Eq. (22.12) gives

$$y_{i+1} = y_i - a y_i h$$

or

$$y_{i+1} = y_i (1 - ah) \tag{22.13}$$

The parenthetical quantity 1 − ah is called an amplification factor. If its absolute value is
greater than unity, the solution will grow in an unbounded fashion. So clearly, the stability
depends on the step size h. That is, if h > 2/a, $|y_i| \to \infty$ as $i \to \infty$. Based on this analysis,
Euler’s method is said to be conditionally stable.
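
The stability limit can be seen directly by applying Euler’s method to Eq. (22.12) with step sizes on either side of 2/a. In this sketch the values a = 10, y0 = 1, and the three step sizes are assumptions chosen for illustration, so the limit is h = 0.2.

```matlab
a = 10; y0 = 1; tf = 3;
hs = [0.05 0.15 0.25];           % two stable step sizes and one unstable (limit is 2/a = 0.2)
yend = zeros(size(hs));
for j = 1:length(hs)
  h = hs(j); n = round(tf/h);    % number of steps needed to reach tf
  y = y0;
  for i = 1:n
    y = y + (-a*y)*h;            % Euler's method applied to dy/dt = -a*y
  end
  yend(j) = y;
end
amp = 1 - a*hs;                  % amplification factors from Eq. (22.13)
disp([hs; amp; yend])
```

For the first two step sizes the magnitude of the amplification factor is below 1 and the computed value at t = 3 decays toward zero, as the true solution does. For h = 0.25 the factor is −1.5, so the numerical solution oscillates in sign and grows.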

Note that there are certain ODEs where errors always grow regardless of the method. 
Such ODEs are called ill-conditioned. 
Inaccuracy and instability are often confused. This is probably because (a) both represent
situations where the numerical solution breaks down and (b) both are affected by step
size. However, they are distinct problems. For example, an inaccurate method can be very
stable. We will return to the topic when we discuss stiff systems in Chap. 23.


22.2.3 MATLAB M-file Function: eulode 


We have already developed a simple M-file to implement Euler’s method for the falling
bungee jumper problem in Chap. 3. Recall from Sec. 3.6 that this function used Euler’s
method to compute the velocity after a given time of free fall. Now, let’s develop a more
general, all-purpose algorithm.

Figure 22.3 shows an M-file that uses Euler’s method to compute values of the dependent
variable y over a range of values of the independent variable t. The name of the function
holding the right-hand side of the differential equation is passed into the function as the
variable dydt. The initial and final values of the desired range of the independent variable
are passed as a vector tspan. The initial value and the desired step size are passed as y0 and
h, respectively.


FIGURE 22.3
An M-file to implement Euler’s method.


function [t,y] = eulode(dydt,tspan,y0,h,varargin)
% eulode: Euler ODE solver
%   [t,y] = eulode(dydt,tspan,y0,h,p1,p2,...):
%           uses Euler's method to integrate an ODE
% input:
%   dydt = name of the M-file that evaluates the ODE
%   tspan = [ti, tf] where ti and tf = initial and
%           final values of independent variable
%   y0 = initial value of dependent variable
%   h = step size
%   p1,p2,... = additional parameters used by dydt
% output:
%   t = vector of independent variable
%   y = vector of solution for dependent variable

if nargin<4,error('at least 4 input arguments required'),end
ti = tspan(1);tf = tspan(2);
if ~(tf>ti),error('upper limit must be greater than lower'),end
t = (ti:h:tf)'; n = length(t);
% if necessary, add an additional value of t
% so that range goes from t = ti to tf
if t(n)<tf
  t(n+1) = tf;
  n = n+1;
end
y = y0*ones(n,1); %preallocate y to improve efficiency
for i = 1:n-1 %implement Euler's method
  y(i+1) = y(i) + dydt(t(i),y(i),varargin{:})*(t(i+1)-t(i));
end

The function first generates a vector t over the desired range of the independent variable
using an increment of h. In the event that the step size is not evenly divisible into the range,
the last value will fall short of the final value of the range. If this occurs, the final value is
added to t so that the series spans the complete range. The length of the t vector is determined
as n. In addition, a vector of the dependent variable y is preallocated with n values of
the initial condition to improve efficiency.
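
The handling of a step size that does not divide evenly into the range can be tried on its own. The short sketch below repeats the relevant lines from Fig. 22.3 for an assumed range of 0 to 1 and an assumed step size of 0.3. The variable steps is added here only to show the resulting step lengths.

```matlab
ti = 0; tf = 1; h = 0.3;      % h does not divide evenly into tf - ti
t = (ti:h:tf)'; n = length(t);
% the colon operator stops at 0.9, short of tf, so append tf
if t(n) < tf
  t(n+1) = tf;
  n = n + 1;
end
steps = diff(t);              % step lengths that the Euler loop would use
disp(t')
disp(steps')
```

The final step is shorter than h (about 0.1 here), which is why the loop in Fig. 22.3 computes each step as t(i+1)-t(i) rather than using h directly.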

At this point, Euler’s method [Eq. (22.5)] is implemented by a simple loop: 


for i = 1:n-1
  y(i+1) = y(i) + dydt(t(i),y(i),varargin{:})*(t(i+1)-t(i));
end


Notice how a function is used to generate a value for the derivative at the appropriate values
of the independent and dependent variables. Also notice how the time step is automatically
calculated based on the difference between adjacent values in the vector t.

The ODE being solved can be set up in several ways. First, the differential equation can 
be defined as an anonymous function object. For example, for the ODE from Example 22.1: 


>> dydt=@(t,y) 4*exp(0.8*t) - 0.5*y; 
The solution can then be generated as 


>> [t,y] = eulode(dydt,[0 4],2,1); 
>> disp([t,y]) 


with the result (compare with Table 22.1): 


0 2.0000 
1.0000 5.0000 
2.0000 11.4022 
3.0000 25.5132 
4.0000 56.8493 


Although using an anonymous function is feasible for the present case, there will be 
more complex problems where the definition of the ODE requires several lines of code. In 
such instances, creating a separate M-file is the only option. 


22.3 IMPROVEMENTS OF EULER’S METHOD


A fundamental source of error in Euler’s method is that the derivative at the beginning of
the interval is assumed to apply across the entire interval. Two simple modifications are
available to help circumvent this shortcoming. As will be demonstrated in Sec. 22.4, both
modifications (as well as Euler’s method itself) actually belong to a larger class of solution
techniques called Runge-Kutta methods. However, because they have very straightforward
graphical interpretations, we will present them prior to their formal derivation as
Runge-Kutta methods.
22.3.1 Heun’s Method


One method to improve the estimate of the slope involves the determination of two derivatives
for the interval, one at the beginning and another at the end. The two derivatives
are then averaged to obtain an improved estimate of the slope for the entire interval. This
approach, called Heun’s method, is depicted graphically in Fig. 22.4.

Recall that in Euler’s method, the slope at the beginning of an interval

$$y_i' = f(t_i, y_i) \tag{22.14}$$

is used to extrapolate linearly to $y_{i+1}$:

$$y_{i+1}^0 = y_i + f(t_i, y_i)h \tag{22.15}$$

For the standard Euler method we would stop at this point. However, in Heun’s method the
$y_{i+1}^0$ calculated in Eq. (22.15) is not the final answer, but an intermediate prediction. This
is why we have distinguished it with a superscript 0. Equation (22.15) is called a predictor
equation. It provides an estimate that allows the calculation of a slope at the end of the
interval:

$$y_{i+1}' = f(t_{i+1}, y_{i+1}^0) \tag{22.16}$$

Thus, the two slopes [Eqs. (22.14) and (22.16)] can be combined to obtain an average slope
for the interval:

$$\bar{y}' = \frac{f(t_i, y_i) + f(t_{i+1}, y_{i+1}^0)}{2}$$

This average slope is then used to extrapolate linearly from $y_i$ to $y_{i+1}$ using Euler’s method:

$$y_{i+1} = y_i + \frac{f(t_i, y_i) + f(t_{i+1}, y_{i+1}^0)}{2}\, h \tag{22.17}$$

which is called a corrector equation.

FIGURE 22.4
Graphical depiction of Heun’s method. (a) Predictor and (b) corrector.


FIGURE 22.5
Graphical representation of iterating the corrector of Heun’s method to obtain an improved
estimate.


The Heun method is a predictor-corrector approach. As just derived, it can be expressed
concisely as

Predictor (Fig. 22.4a):

$$y_{i+1}^0 = y_i^m + f(t_i, y_i^m)h \tag{22.18}$$

Corrector (Fig. 22.4b):

$$y_{i+1}^j = y_i^m + \frac{f(t_i, y_i^m) + f(t_{i+1}, y_{i+1}^{j-1})}{2}\, h \tag{22.19}$$

(for j = 1, 2, ..., m)

where $y_i^m$ is the final corrected value from the previous step. Note that because Eq. (22.19)
has $y_{i+1}$ on both sides of the equal sign, it can be applied in
an iterative fashion as indicated. That is, an old estimate can be used repeatedly to provide
an improved estimate of $y_{i+1}$. The process is depicted in Fig. 22.5.

As with similar iterative methods discussed in previous sections of the book, a termination
criterion for convergence of the corrector is provided by

$$|\varepsilon_a| = \left|\frac{y_{i+1}^j - y_{i+1}^{j-1}}{y_{i+1}^j}\right| \times 100\%$$

where $y_{i+1}^{j-1}$ and $y_{i+1}^j$ are the result from the prior and the present iteration of the corrector,
respectively. It should be understood that the iterative process does not necessarily converge
on the true answer but will converge on an estimate with a finite truncation error, as
demonstrated in the following example.


EXAMPLE 22.2 Heun’s Method


Problem Statement. Use Heun’s method with iteration to integrate $y' = 4e^{0.8t} - 0.5y$ from
t = 0 to 4 with a step size of 1. The initial condition at t = 0 is y = 2. Employ a stopping
criterion of 0.00001% to terminate the corrector iterations.


Solution. First, the slope at $(t_0, y_0)$ is calculated as

$$y_0' = 4e^{0} - 0.5(2) = 3$$

Then, the predictor is used to compute a value at 1.0:

$$y_1^0 = 2 + 3(1) = 5$$
TABLE 22.2 Comparison of true and numerical values of the integral of $y' = 4e^{0.8t} - 0.5y$,
with the initial condition that y = 2 at t = 0. The numerical values
were computed using the Euler and Heun methods with a step size of 1.
The Heun method was implemented both without and with iteration of
the corrector.

                                         Without Iteration       With Iteration
t    y_true     y_Euler    |ε_t| (%)    y_Heun     |ε_t| (%)    y_Heun     |ε_t| (%)
0     2.00000    2.00000                 2.00000                 2.00000
1     6.19463    5.00000   19.28         6.70108    8.18         6.36087    2.68
2    14.84392   11.40216   23.19        16.31978    9.94        15.30224    3.09
3    33.67717   25.51321   24.24        37.19925   10.46        34.74328    3.17
4    75.33896   56.84931   24.54        83.33777   10.62        77.73510    3.18


Note that this is the result that would be obtained by the standard Euler method. The true 
value in Table 22.2 shows that it corresponds to a percent relative error of 19.28%. 

Now, to improve the estimate for $y_{i+1}$, we use the value $y_1^0$ to predict the slope at the end
of the interval

$$y_1' = f(t_1, y_1^0) = 4e^{0.8(1)} - 0.5(5) = 6.402164$$

which can be combined with the initial slope to yield an average slope over the interval
from t = 0 to 1:

$$\bar{y}' = \frac{3 + 6.402164}{2} = 4.701082$$

This result can then be substituted into the corrector [Eq. (22.19)] to give the prediction at
t = 1:

$$y_1^1 = 2 + 4.701082(1) = 6.701082$$

which represents a true percent relative error of −8.18%. Thus, the Heun method without
iteration of the corrector reduces the absolute value of the error by a factor of about 2.4 as
compared with Euler’s method. At this point, we can also compute an approximate error as

$$|\varepsilon_a| = \left|\frac{6.701082 - 5}{6.701082}\right| \times 100\% = 25.39\%$$

Now the estimate of $y_1$ can be refined by substituting the new result back into the right-hand
side of Eq. (22.19) to give

$$y_1^2 = 2 + \frac{3 + 4e^{0.8(1)} - 0.5(6.701082)}{2}\,(1) = 6.275811$$

which represents a true percent relative error of 1.31 percent and an approximate error of

$$|\varepsilon_a| = \left|\frac{6.275811 - 6.701082}{6.275811}\right| \times 100\% = 6.776\%$$
The next iteration gives

$$y_1^3 = 2 + \frac{3 + 4e^{0.8(1)} - 0.5(6.275811)}{2}\,(1) = 6.382129$$

which represents a true error of 3.03% and an approximate error of 1.666%.

The approximate error will keep dropping as the iterative process converges on a stable
final result. In this example, after 12 iterations the approximate error falls below the
stopping criterion. At this point, the result at t = 1 is 6.36087, which represents a true relative
error of 2.68%. Table 22.2 shows results for the remainder of the computation along
with results for Euler’s method and for the Heun method without iteration of the corrector.
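
The first step of this example can be reproduced with a short script. The sketch below applies the predictor once and then iterates the corrector until the approximate error falls below the stopping criterion. The variable names and the iteration cap maxit are assumptions made for illustration.

```matlab
f = @(t,y) 4*exp(0.8*t) - 0.5*y;
ti = 0; yi = 2; h = 1;
es = 0.00001;                 % stopping criterion (%)
maxit = 50;                   % safety cap on corrector iterations

k1 = f(ti,yi);                % slope at the beginning of the interval
ynew = yi + k1*h;             % predictor, Eq. (22.18)
iter = 0;
while true
  yold = ynew;
  ynew = yi + (k1 + f(ti+h,yold))/2*h;   % corrector, Eq. (22.19)
  iter = iter + 1;
  ea = abs((ynew - yold)/ynew)*100;      % approximate percent relative error
  if ea <= es || iter >= maxit, break, end
end
fprintf('y(1) = %.5f after %d corrector iterations\n', ynew, iter)
```

Wrapping these lines in an outer loop over the steps, with the converged ynew becoming the next yi, would generate the remaining "With Iteration" entries of Table 22.2.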


Insight into the local error of the Heun method can be gained by recognizing that it
is related to the trapezoidal rule. In the previous example, the derivative is a function of
both the dependent variable y and the independent variable t. For cases such as polynomials,
where the ODE is solely a function of the independent variable, the predictor step
[Eq. (22.18)] is not required and the corrector is applied only once for each step. For
such cases, the technique is expressed concisely as

$$y_{i+1} = y_i + \frac{f(t_i) + f(t_{i+1})}{2}\, h \tag{22.20}$$

Notice the similarity between the second term on the right-hand side of Eq. (22.20) and the
trapezoidal rule [Eq. (19.11)]. The connection between the two methods can be formally
demonstrated by starting with the ordinary differential equation

$$\frac{dy}{dt} = f(t) \tag{22.21}$$

This equation can be solved for y by integration:

$$\int_{y_i}^{y_{i+1}} dy = \int_{t_i}^{t_{i+1}} f(t)\,dt \tag{22.22}$$

which yields

$$y_{i+1} - y_i = \int_{t_i}^{t_{i+1}} f(t)\,dt \tag{22.23}$$

or

$$y_{i+1} = y_i + \int_{t_i}^{t_{i+1}} f(t)\,dt \tag{22.24}$$

Now, recall that the trapezoidal rule [Eq. (19.11)] is defined as

$$\int_{t_i}^{t_{i+1}} f(t)\,dt \cong \frac{f(t_i) + f(t_{i+1})}{2}\, h \tag{22.25}$$

where $h = t_{i+1} - t_i$. Substituting Eq. (22.25) into Eq. (22.24) yields

$$y_{i+1} = y_i + \frac{f(t_i) + f(t_{i+1})}{2}\, h \tag{22.26}$$

which is equivalent to Eq. (22.20). For this reason, Heun’s method is sometimes referred
to as the trapezoidal rule.
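
This equivalence is easy to confirm numerically. In the sketch below, the right-hand side f(t), the step size, and the initial value are assumptions chosen for illustration. The accumulated Heun increments are compared with the result of MATLAB’s built-in trapz function applied to the same points.

```matlab
f = @(t) 4*exp(0.8*t);        % an ODE right-hand side that depends on t alone
h = 0.5; t = (0:h:4)'; n = length(t);
y = zeros(n,1); y(1) = 2;     % initial condition (assumed)
for i = 1:n-1
  y(i+1) = y(i) + (f(t(i)) + f(t(i+1)))/2*h;   % Eq. (22.20)
end
I = trapz(t, f(t));           % composite trapezoidal rule over the same points
fprintf('Heun increment = %.6f, trapz = %.6f\n', y(end)-y(1), I)
```

The two printed values should agree to within roundoff, because each Heun step adds exactly one trapezoid.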

Because Eq. (22.26) is a direct expression of the trapezoidal rule, the local truncation
error is given by [recall Eq. (19.14)]

$$E_t = -\frac{f''(\xi)}{12}\, h^3 \tag{22.27}$$

where $\xi$ is between $t_i$ and $t_{i+1}$. Thus, the method is second order because the second derivative
of the ODE is zero when the true solution is a quadratic. In addition, the local
and global errors are $O(h^3)$ and $O(h^2)$, respectively. Therefore, decreasing the step size
decreases the error at a faster rate than for Euler’s method.


22.3.2 The Midpoint Method 


Figure 22.6 illustrates another simple modification of Euler’s method. Called the midpoint
method, this technique uses Euler’s method to predict a value of y at the midpoint of the
interval (Fig. 22.6a):

$$y_{i+1/2} = y_i + f(t_i, y_i)\,\frac{h}{2} \tag{22.28}$$

Then, this predicted value is used to calculate a slope at the midpoint:

$$y_{i+1/2}' = f(t_{i+1/2}, y_{i+1/2}) \tag{22.29}$$

which is assumed to represent a valid approximation of the average slope for the entire
interval. This slope is then used to extrapolate linearly from $t_i$ to $t_{i+1}$ (Fig. 22.6b):

$$y_{i+1} = y_i + f(t_{i+1/2}, y_{i+1/2})h \tag{22.30}$$

Observe that because $y_{i+1}$ is not on both sides, the corrector [Eq. (22.30)] cannot be applied
iteratively to improve the solution as was done with Heun’s method.


FIGURE 22.6
Graphical depiction of midpoint method. (a) Predictor and (b) corrector.

As in our discussion of Heun’s method, the midpoint method can also be linked to
Newton-Cotes integration formulas. Recall from Table 19.4 that the simplest Newton-Cotes
open integration formula, which is called the midpoint method, can be represented as

$$\int_a^b f(x)\,dx \cong (b - a) f(x_1) \tag{22.31}$$

where $x_1$ is the midpoint of the interval (a, b). Using the nomenclature for the present case,
it can be expressed as

$$\int_{t_i}^{t_{i+1}} f(t)\,dt \cong h f(t_{i+1/2}) \tag{22.32}$$

Substitution of this formula into Eq. (22.24) yields Eq. (22.30). Thus, just as the Heun
method can be called the trapezoidal rule, the midpoint method gets its name from the
underlying integration formula on which it is based.

The midpoint method is superior to Euler’s method because it utilizes a slope estimate
at the midpoint of the prediction interval. Recall from our discussion of numerical
differentiation in Sec. 4.3.4 that centered finite differences are better approximations of
derivatives than either forward or backward versions. In the same sense, a centered approximation
such as Eq. (22.29) has a local truncation error of $O(h^2)$ in comparison with the forward
approximation of Euler’s method, which has an error of O(h). Consequently, the local and
global errors of the midpoint method are $O(h^3)$ and $O(h^2)$, respectively.
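
As a brief illustration, the midpoint method can be applied to the same ODE used in Examples 22.1 and 22.2. This is a minimal sketch in which the names yhalf and slope are chosen here for illustration.

```matlab
f = @(t,y) 4*exp(0.8*t) - 0.5*y;
h = 1; t = (0:h:4)'; n = length(t);
y = zeros(n,1); y(1) = 2;                 % initial condition y(0) = 2
for i = 1:n-1
  yhalf = y(i) + f(t(i),y(i))*h/2;        % predict y at the midpoint, Eq. (22.28)
  slope = f(t(i) + h/2, yhalf);           % slope at the midpoint, Eq. (22.29)
  y(i+1) = y(i) + slope*h;                % full step with the midpoint slope, Eq. (22.30)
end
disp([t y])
```

For the first step, the midpoint prediction is 3.5, the midpoint slope is about 4.217299, and the value at t = 1 is about 6.217299. This is much closer to the true value of 6.19463 than Euler’s result of 5.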


22.4 RUNGE-KUTTA METHODS


Runge-Kutta (RK) methods achieve the accuracy of a Taylor series approach without
requiring the calculation of higher derivatives. Many variations exist but all can be cast in
the generalized form of Eq. (22.4):

$$y_{i+1} = y_i + \phi h \tag{22.33}$$

where $\phi$ is called an increment function, which can be interpreted as a representative slope
over the interval. The increment function can be written in general form as

$$\phi = a_1 k_1 + a_2 k_2 + \cdots + a_n k_n \tag{22.34}$$

where the a’s are constants and the k’s are

$$k_1 = f(t_i, y_i) \tag{22.34a}$$
$$k_2 = f(t_i + p_1 h,\; y_i + q_{11} k_1 h) \tag{22.34b}$$
$$k_3 = f(t_i + p_2 h,\; y_i + q_{21} k_1 h + q_{22} k_2 h) \tag{22.34c}$$
$$\vdots$$
$$k_n = f(t_i + p_{n-1} h,\; y_i + q_{n-1,1} k_1 h + q_{n-1,2} k_2 h + \cdots + q_{n-1,n-1} k_{n-1} h) \tag{22.34d}$$

where the p’s and q’s are constants. Notice that the k’s are recurrence relationships. That is,
$k_1$ appears in the equation for $k_2$, which appears in the equation for $k_3$, and so forth. Because
each k is a functional evaluation, this recurrence makes RK methods efficient for computer
calculations.

Various types of RK methods can be devised by employing different numbers of terms
in the increment function as specified by n. Note that the first-order RK method with
n = 1 is, in fact, Euler’s method. Once n is chosen, values for the a’s, p’s, and q’s are evaluated
by setting Eq. (22.33) equal to terms in a Taylor series expansion. Thus, at least for the
lower-order versions, the number of terms n usually represents the order of the approach.
For example, in Sec. 22.4.1, second-order RK methods use an increment function with two
terms (n = 2). These second-order methods will be exact if the solution to the differential
equation is quadratic. In addition, because terms with $h^3$ and higher are dropped during the
derivation, the local truncation error is $O(h^3)$ and the global error is $O(h^2)$. In Sec. 22.4.2,
the fourth-order RK method (n = 4) is presented for which the global truncation error
is $O(h^4)$.


22.4.1 Second-Order Runge-Kutta Methods 
The second-order version of Eq. (22.33) is

$$y_{i+1} = y_i + (a_1 k_1 + a_2 k_2)h \tag{22.35}$$

where

$$k_1 = f(t_i, y_i) \tag{22.35a}$$
$$k_2 = f(t_i + p_1 h,\; y_i + q_{11} k_1 h) \tag{22.35b}$$

The values for $a_1$, $a_2$, $p_1$, and $q_{11}$ are evaluated by setting Eq. (22.35) equal to a
second-order Taylor series. By doing this, three equations can be derived to evaluate the
four unknown constants (see Chapra and Canale, 2010, for details). The three equations are

$$a_1 + a_2 = 1 \tag{22.36}$$
$$a_2 p_1 = 1/2 \tag{22.37}$$
$$a_2 q_{11} = 1/2 \tag{22.38}$$

Because we have three equations with four unknowns, these equations are said to be
underdetermined. We, therefore, must assume a value of one of the unknowns to determine
the other three. Suppose that we specify a value for $a_2$. Then Eqs. (22.36) through (22.38)
can be solved simultaneously for

$$a_1 = 1 - a_2 \tag{22.39}$$

$$p_1 = q_{11} = \frac{1}{2a_2} \tag{22.40}$$

Because we can choose an infinite number of values for $a_2$, there are an infinite number
of second-order RK methods. Every version would yield exactly the same results if the
solution to the ODE were quadratic, linear, or a constant. However, they yield different
results when (as is typically the case) the solution is more complicated. Three of the most
commonly used and preferred versions are presented next.
Heun Method without Iteration ($a_2$ = 1/2). If $a_2$ is assumed to be 1/2, Eqs. (22.39) and
(22.40) can be solved for $a_1$ = 1/2 and $p_1 = q_{11} = 1$. These parameters, when substituted
into Eq. (22.35), yield

$$y_{i+1} = y_i + \left(\frac{1}{2} k_1 + \frac{1}{2} k_2\right) h \tag{22.41}$$

where

$$k_1 = f(t_i, y_i) \tag{22.41a}$$
$$k_2 = f(t_i + h,\; y_i + k_1 h) \tag{22.41b}$$

Note that $k_1$ is the slope at the beginning of the interval and $k_2$ is the slope at the end of the
interval. Consequently, this second-order RK method is actually Heun’s technique without
iteration of the corrector.


The Midpoint Method ($a_2$ = 1). If $a_2$ is assumed to be 1, then $a_1 = 0$, $p_1 = q_{11} = 1/2$, and
Eq. (22.35) becomes

$$y_{i+1} = y_i + k_2 h \tag{22.42}$$

where

$$k_1 = f(t_i, y_i) \tag{22.42a}$$
$$k_2 = f(t_i + h/2,\; y_i + k_1 h/2) \tag{22.42b}$$

This is the midpoint method.


Ralston’s Method ($a_2$ = 3/4). Ralston (1962) and Ralston and Rabinowitz (1978)
determined that choosing $a_2$ = 3/4 provides a minimum bound on the truncation error
for the second-order RK algorithms. For this version, $a_1$ = 1/4 and $p_1 = q_{11} = 2/3$, and
Eq. (22.35) becomes

$$y_{i+1} = y_i + \left(\frac{1}{4} k_1 + \frac{3}{4} k_2\right) h \tag{22.43}$$

where

$$k_1 = f(t_i, y_i) \tag{22.43a}$$
$$k_2 = f\left(t_i + \frac{2}{3} h,\; y_i + \frac{2}{3} k_1 h\right) \tag{22.43b}$$
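
Because all three versions differ only in the choice of $a_2$, they can be written as one function of that parameter. The sketch below is an illustration under that assumption. The handle name rk2step is hypothetical, and the other constants are obtained from Eqs. (22.39) and (22.40).

```matlab
f = @(t,y) 4*exp(0.8*t) - 0.5*y;      % ODE from Examples 22.1 and 22.2

% One second-order RK step for a chosen a2, with a1 = 1 - a2 and
% p1 = q11 = 1/(2*a2) taken from Eqs. (22.39) and (22.40)
rk2step = @(f,t,y,h,a2) y + ((1 - a2)*f(t,y) + ...
    a2*f(t + h/(2*a2), y + f(t,y)*h/(2*a2)))*h;

y_heun    = rk2step(f, 0, 2, 1, 1/2);   % Heun without iteration
y_mid     = rk2step(f, 0, 2, 1, 1);     % midpoint method
y_ralston = rk2step(f, 0, 2, 1, 3/4);   % Ralston's method

% For dy/dt = 2t with y(0) = 0 the solution is quadratic (y = t^2),
% so every version should return y(1) = 1
q = @(t,y) 2*t;
y_quad = [rk2step(q,0,0,1,1/2), rk2step(q,0,0,1,1), rk2step(q,0,0,1,3/4)];
disp([y_heun y_mid y_ralston])
disp(y_quad)
```

For the first step of the example ODE, the three versions give different estimates of y(1), about 6.7011, 6.2173, and 6.3638. For the quadratic test case, all three return the exact value of 1.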


22.4.2 Classical Fourth-Order Runge-Kutta Method 


The most popular RK methods are fourth order. As with the second-order approaches, 
there are an infinite number of versions. The following is the most commonly used form, 
and we therefore call it the classical fourth-order RK method: 


$$y_{i+1} = y_i + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)h \tag{22.44}$$

where

$$k_1 = f(t_i, y_i) \tag{22.44a}$$
$$k_2 = f\left(t_i + \frac{1}{2} h,\; y_i + \frac{1}{2} k_1 h\right) \tag{22.44b}$$
$$k_3 = f\left(t_i + \frac{1}{2} h,\; y_i + \frac{1}{2} k_2 h\right) \tag{22.44c}$$
$$k_4 = f(t_i + h,\; y_i + k_3 h) \tag{22.44d}$$

Notice that for ODEs that are a function of t alone, the classical fourth-order RK
method is similar to Simpson’s 1/3 rule. In addition, the fourth-order RK method is similar
to the Heun approach in that multiple estimates of the slope are developed to come up
with an improved average slope for the interval. As depicted in Fig. 22.7, each of the k’s
represents a slope. Equation (22.44) then represents a weighted average of these to arrive
at the improved slope.


FIGURE 22.7
Graphical depiction of the slope estimates comprising the fourth-order RK method.


EXAMPLE 22.3 Classical Fourth-Order RK Method 


Problem Statement. Employ the classical fourth-order RK method to integrate
$y' = 4e^{0.8t} - 0.5y$ from t = 0 to 1 using a step size of 1 with y(0) = 2.


Solution. For this case, the slope at the beginning of the interval is computed as

$$k_1 = f(0, 2) = 4e^{0.8(0)} - 0.5(2) = 3$$

This value is used to compute a value of y and a slope at the midpoint:

$$y(0.5) = 2 + 3(0.5) = 3.5$$
$$k_2 = f(0.5, 3.5) = 4e^{0.8(0.5)} - 0.5(3.5) = 4.217299$$

This slope in turn is used to compute another value of y and another slope at the midpoint:

$$y(0.5) = 2 + 4.217299(0.5) = 4.108649$$
$$k_3 = f(0.5, 4.108649) = 4e^{0.8(0.5)} - 0.5(4.108649) = 3.912974$$

Next, this slope is used to compute a value of y and a slope at the end of the interval:

$$y(1.0) = 2 + 3.912974(1.0) = 5.912974$$
$$k_4 = f(1.0, 5.912974) = 4e^{0.8(1.0)} - 0.5(5.912974) = 5.945677$$

Finally, the four slope estimates are combined to yield an average slope. This average slope
is then used to make the final prediction at the end of the interval.

$$\phi = \frac{1}{6}\left[3 + 2(4.217299) + 2(3.912974) + 5.945677\right] = 4.201037$$

$$y(1.0) = 2 + 4.201037(1.0) = 6.201037$$

which compares favorably with the true solution of 6.194631 ($\varepsilon_t$ = 0.103%).
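
The same step can be carried out in MATLAB. The following minimal sketch evaluates Eqs. (22.44a) through (22.44d) in order and then forms the weighted average. The variable names mirror the symbols in the equations.

```matlab
f = @(t,y) 4*exp(0.8*t) - 0.5*y;
ti = 0; yi = 2; h = 1;

k1 = f(ti, yi);                     % slope at the beginning of the interval
k2 = f(ti + h/2, yi + k1*h/2);      % first slope at the midpoint
k3 = f(ti + h/2, yi + k2*h/2);      % second slope at the midpoint
k4 = f(ti + h, yi + k3*h);          % slope at the end of the interval

phi = (k1 + 2*k2 + 2*k3 + k4)/6;    % weighted average slope, Eq. (22.44)
ynew = yi + phi*h;
fprintf('k = %.6f %.6f %.6f %.6f\n', k1, k2, k3, k4)
fprintf('y(1) = %.6f\n', ynew)
```

The printed slopes and the final value should match those obtained by hand above.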


It is certainly possible to develop fifth- and higher-order RK methods. For example, 
Butcher’s (1964) fifth-order RK method is written as 


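$$y_{i+1} = y_i + \frac{1}{90}(7k_1 + 32k_3 + 12k_4 + 32k_5 + 7k_6)h \tag{22.45}$$

where

$$k_1 = f(t_i, y_i) \tag{22.45a}$$
$$k_2 = f\left(t_i + \tfrac{1}{4}h,\; y_i + \tfrac{1}{4}k_1 h\right) \tag{22.45b}$$
$$k_3 = f\left(t_i + \tfrac{1}{4}h,\; y_i + \tfrac{1}{8}k_1 h + \tfrac{1}{8}k_2 h\right) \tag{22.45c}$$
$$k_4 = f\left(t_i + \tfrac{1}{2}h,\; y_i - \tfrac{1}{2}k_2 h + k_3 h\right) \tag{22.45d}$$
$$k_5 = f\left(t_i + \tfrac{3}{4}h,\; y_i + \tfrac{3}{16}k_1 h + \tfrac{9}{16}k_4 h\right) \tag{22.45e}$$
$$k_6 = f\left(t_i + h,\; y_i - \tfrac{3}{7}k_1 h + \tfrac{2}{7}k_2 h + \tfrac{12}{7}k_3 h - \tfrac{12}{7}k_4 h + \tfrac{8}{7}k_5 h\right) \tag{22.45f}$$


Note the similarity between Butcher’s method and Boole’s rule in Table 19.2. As expected,
this method has a global truncation error of $O(h^5)$.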
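To get a feel for the extra accuracy, Butcher’s formulas can be applied to the ODE of Example 22.3 with the same single step of h = 1. The sketch below is illustrative only: the variable names are chosen here, and the reference value 6.194631 is the true solution quoted in that example.

```matlab
% One step of Butcher's fifth-order RK method for y' = 4e^(0.8t) - 0.5y
f = @(t,y) 4*exp(0.8*t) - 0.5*y;
t0 = 0; y0 = 2; h = 1;

k1 = f(t0,         y0);
k2 = f(t0 + h/4,   y0 + k1*h/4);
k3 = f(t0 + h/4,   y0 + k1*h/8 + k2*h/8);
k4 = f(t0 + h/2,   y0 - k2*h/2 + k3*h);
k5 = f(t0 + 3*h/4, y0 + 3*k1*h/16 + 9*k4*h/16);
k6 = f(t0 + h,     y0 - 3*k1*h/7 + 2*k2*h/7 + 12*k3*h/7 ...
                      - 12*k4*h/7 + 8*k5*h/7);

% Weighted average of the slopes (note that k2 does not appear)
y1 = y0 + (7*k1 + 32*k3 + 12*k4 + 32*k5 + 7*k6)*h/90;

ytrue = 6.194631;                        % true value quoted in Example 22.3
fprintf('y(1) = %.6f, |error| = %.2e\n', y1, abs(ytrue - y1));
```

The error should be noticeably smaller than the 0.0064 obtained with the fourth-order method, at the cost of six function evaluations instead of four.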

Although the fifth-order version provides more accuracy, notice that six function evalu-
ations are required. Recall that up through the fourth-order versions, n function evaluations
are required for an nth-order RK method. Interestingly, for orders higher than four, one or
two additional function evaluations are necessary. Because the function evaluations account
for the most computation time, methods of order five and higher are usually considered
relatively less efficient than the fourth-order versions. This is one of the main reasons for
the popularity of the fourth-order RK method.


22.5 SYSTEMS OF EQUATIONS


Many practical problems in engineering and science require the solution of a system of 
simultaneous ordinary differential equations rather than a single equation. Such systems 
may be represented generally as 


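$$
\begin{aligned}
\frac{dy_1}{dt} &= f_1(t, y_1, y_2, \ldots, y_n) \\
\frac{dy_2}{dt} &= f_2(t, y_1, y_2, \ldots, y_n) \\
&\;\;\vdots \\
\frac{dy_n}{dt} &= f_n(t, y_1, y_2, \ldots, y_n)
\end{aligned}
\tag{22.46}
$$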


The solution of such a system requires that n initial conditions be known at the starting 
value of t. 

An example is the calculation of the bungee jumper’s velocity and position that we 
set up at the beginning of this chapter. For the free-fall portion of the jump, this problem 
amounts to solving the following system of ODEs: 


$$\frac{dx}{dt} = v \tag{22.47}$$
$$\frac{dv}{dt} = g - \frac{c_d}{m} v^2 \tag{22.48}$$


If the stationary platform from which the jumper launches is defined as x = 0, the initial 
conditions would be x(0) = v(0) = 0. 


22.5.1 Euler’s Method 


All the methods discussed in this chapter for single equations can be extended to systems 
of ODEs. Engineering applications can involve thousands of simultaneous equations. In 
each case, the procedure for solving a system of equations simply involves applying the 
one-step technique for every equation at each step before proceeding to the next step. This 
is best illustrated by the following example for Euler’s method. 


EXAMPLE 22.4 Solving Systems of ODEs with Euler’s Method


Problem Statement. Solve for the velocity and position of the free-falling bungee jumper
using Euler’s method. Assume that at t = 0, x = v = 0, and integrate to t = 10 s with a step
size of 2 s. As was done previously in Examples 1.1 and 1.2, the gravitational acceleration
is 9.81 m/s², and the jumper has a mass of 68.1 kg with a drag coefficient of 0.25 kg/m.
Recall that the analytical solution for velocity is [Eq. (1.9)]:


$$v(t) = \sqrt{\frac{g m}{c_d}} \tanh\left(\sqrt{\frac{g c_d}{m}}\, t\right)$$


This result can be substituted into Eq. (22.47), which can be integrated to determine an
analytical solution for distance as


$$x(t) = \frac{m}{c_d} \ln\left[\cosh\left(\sqrt{\frac{g c_d}{m}}\, t\right)\right]$$


Use these analytical solutions to compute the true relative errors of the results.


Solution. The ODEs can be used to compute the slopes at t = 0 as 


$$\frac{dx}{dt} = 0$$
$$\frac{dv}{dt} = 9.81 - \frac{0.25}{68.1}(0)^2 = 9.81$$


Euler’s method is then used to compute the values at t = 2 s, 

$$x = 0 + 0(2) = 0$$
$$v = 0 + 9.81(2) = 19.62$$

The analytical solutions can be computed as x(2) = 19.16629 and v(2) = 18.72919. Thus,
the percent relative errors are 100% and 4.756%, respectively.

The process can be repeated to compute the results at t = 4 as

$$x = 0 + 19.62(2) = 39.24$$
$$v = 19.62 + \left[9.81 - \frac{0.25}{68.1}(19.62)^2\right] 2 = 36.41368$$


Proceeding in a like manner gives the results displayed in Table 22.3. 


TABLE 22.3 Distance and velocity of a free-falling bungee jumper as computed
numerically with Euler’s method.


 t    x_true     v_true    x_Euler    v_Euler   ε_t(x)    ε_t(v)
 0      0          0         0          0
 2    19.1663   18.7292     0         19.6200   100.00%    4.76%
 4    71.9304   33.1118    39.2400    36.4137    45.45%    9.97%
 6   147.9462   42.0762   112.0674    46.2983    24.25%   10.03%
 8   237.5104   46.9575   204.6640    50.1802    13.83%    6.86%
10   334.1782   49.4214   305.0244    51.3123     8.72%    3.83%

Although the foregoing example illustrates how Euler’s method can be implemented
for systems of ODEs, the results are not very accurate because of the large step size. In
addition, the results for distance are a bit unsatisfying because x does not change until
the second iteration. Using a much smaller step greatly mitigates these deficiencies. As
described next, using a higher-order solver provides decent results even with a relatively
large step size.
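The Euler computation for the bungee jumper can be reproduced with a short script. This is a minimal sketch under stated assumptions: the names dydt, ytrue, and et are chosen here for illustration, the state is stored with x in column 1 and v in column 2, and the analytical solutions are the tanh and ln-cosh expressions from the problem statement.

```matlab
% Euler's method for the free-falling bungee jumper (x = distance, v = velocity)
g = 9.81; m = 68.1; cd = 0.25;
dydt = @(t,y) [y(2); g - cd/m*y(2)^2];   % [dx/dt; dv/dt]

h = 2; t = (0:h:10)'; n = length(t);
y = zeros(n,2);                          % row i holds [x v] at t(i); starts at [0 0]
for i = 1:n-1
  % every equation is advanced with slopes evaluated at the old time level
  y(i+1,:) = y(i,:) + h*dydt(t(i), y(i,:))';
end

% Analytical solutions and true percent relative errors
vtrue = sqrt(g*m/cd)*tanh(sqrt(g*cd/m)*t);
xtrue = m/cd*log(cosh(sqrt(g*cd/m)*t));
ytrue = [xtrue vtrue];
et = zeros(n,2);                         % the t = 0 row is left at zero (0/0 otherwise)
et(2:end,:) = 100*abs((ytrue(2:end,:) - y(2:end,:))./ytrue(2:end,:));

disp([t ytrue y et])
```

The displayed columns are t, the true x and v, the Euler x and v, and the two percent errors, in the same order as the tabulated results.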


22.5.2 Runge-Kutta Methods 


Note that any of the higher-order RK methods in this chapter can be applied to systems of
equations. However, care must be taken in determining the slopes. Figure 22.7 is helpful in
visualizing the proper way to do this for the fourth-order method. That is, we first develop
slopes for all variables at the initial value. These slopes (a set of $k_1$’s) are then used to
make predictions of the dependent variable at the midpoint of the interval. These midpoint
values are in turn used to compute a set of slopes at the midpoint (the $k_2$’s). These new
slopes are then taken back to the starting point to make another set of midpoint predictions
that lead to new slope predictions at the midpoint (the $k_3$’s). These are then employed to
make predictions at the end of the interval that are used to develop slopes at the end of the
interval (the $k_4$’s). Finally, the k’s are combined into a set of increment functions [as in
Eq. (22.44)] that are brought back to the beginning to make the final predictions. The
following example illustrates the approach.


EXAMPLE 22.5 Solving Systems of ODEs with the Fourth-Order RK Method


Problem Statement. Use the fourth-order RK method to solve the same problem we
addressed in Example 22.4.


Solution. First, it is convenient to express the ODEs in the functional format of
Eq. (22.46) as

$$\frac{dx}{dt} = f_1(t, x, v) = v$$
$$\frac{dv}{dt} = f_2(t, x, v) = g - \frac{c_d}{m} v^2$$

The first step in obtaining the solution is to solve for all the slopes at the beginning of the
interval:

$$k_{1,1} = f_1(0, 0, 0) = 0$$
$$k_{1,2} = f_2(0, 0, 0) = 9.81 - \frac{0.25}{68.1}(0)^2 = 9.81$$

where $k_{i,j}$ is the ith value of k for the jth dependent variable. Next, we must calculate the
first values of x and v at the midpoint of the first step:

$$x(1) = x(0) + k_{1,1}\frac{h}{2} = 0 + 0\,\frac{2}{2} = 0$$
$$v(1) = v(0) + k_{1,2}\frac{h}{2} = 0 + 9.81\,\frac{2}{2} = 9.81$$

which can be used to compute the first set of midpoint slopes:

$$k_{2,1} = f_1(1, 0, 9.81) = 9.8100$$
$$k_{2,2} = f_2(1, 0, 9.81) = 9.4567$$


These are used to determine the second set of midpoint predictions:

$$x(1) = x(0) + k_{2,1}\frac{h}{2} = 0 + 9.8100\,\frac{2}{2} = 9.8100$$
$$v(1) = v(0) + k_{2,2}\frac{h}{2} = 0 + 9.4567\,\frac{2}{2} = 9.4567$$

which can be used to compute the second set of midpoint slopes:

$$k_{3,1} = f_1(1, 9.8100, 9.4567) = 9.4567$$
$$k_{3,2} = f_2(1, 9.8100, 9.4567) = 9.4817$$


These are used to determine the predictions at the end of the interval:

$$x(2) = x(0) + k_{3,1} h = 0 + 9.4567(2) = 18.9134$$
$$v(2) = v(0) + k_{3,2} h = 0 + 9.4817(2) = 18.9634$$

which can be used to compute the endpoint slopes:

$$k_{4,1} = f_1(2, 18.9134, 18.9634) = 18.9634$$
$$k_{4,2} = f_2(2, 18.9134, 18.9634) = 8.4898$$

The values of k can then be used to compute [Eq. (22.44)]:

$$x(2) = 0 + \frac{1}{6}[0 + 2(9.8100 + 9.4567) + 18.9634]\,2 = 19.1656$$
$$v(2) = 0 + \frac{1}{6}[9.8100 + 2(9.4567 + 9.4817) + 8.4898]\,2 = 18.7256$$

Proceeding in a like manner for the remaining steps yields the values displayed in
Table 22.4. In contrast to the results obtained with Euler’s method, the fourth-order RK
predictions are much closer to the true values. Further, a highly accurate, nonzero value is
computed for distance on the first step.
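The step-by-step procedure of this example can be condensed by treating the slopes as vectors, so that each k holds one entry per dependent variable. The script below is a sketch with names chosen for illustration (f, k1 through k4); it fixes h = 2 and marches over the full interval from t = 0 to 10.

```matlab
% Fourth-order RK for the bungee jumper system, slopes handled as row vectors
g = 9.81; m = 68.1; cd = 0.25;
f = @(t,y) [y(2), g - cd/m*y(2)^2];      % [dx/dt, dv/dt] with y = [x v]

h = 2; t = (0:h:10)'; n = length(t);
y = zeros(n,2);                          % initial conditions x(0) = v(0) = 0
for i = 1:n-1
  k1 = f(t(i),       y(i,:));            % slopes at the start (all variables)
  k2 = f(t(i) + h/2, y(i,:) + k1*h/2);   % first set of midpoint slopes
  k3 = f(t(i) + h/2, y(i,:) + k2*h/2);   % second set of midpoint slopes
  k4 = f(t(i) + h,   y(i,:) + k3*h);     % slopes at the end
  y(i+1,:) = y(i,:) + (k1 + 2*(k2 + k3) + k4)*h/6;
end
disp([t y])
```

The key point is that all components of k1 are computed before any component of k2, and so on. Updating x fully before starting on v would not be the RK4 method.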


TABLE 22.4 Distance and velocity of a free-falling bungee jumper as computed
numerically with the fourth-order RK method.


 t    x_true     v_true     x_RK4      v_RK4    ε_t(x)    ε_t(v)
 0      0          0          0          0
 2    19.1663   18.7292    19.1656    18.7256   0.004%    0.019%
 4    71.9304   33.1118    71.9311    33.0995   0.001%    0.037%
 6   147.9462   42.0762   147.9521    42.0547   0.004%    0.051%
 8   237.5104   46.9575   237.5104    46.9345   0.000%    0.049%
10   334.1782   49.4214   334.1626    49.4027   0.005%    0.038%

22.5.3 MATLAB M-file Function: rk4sys


Figure 22.8 shows an M-file called rk4sys that uses the fourth-order RK method to solve a sys- 
tem of ODEs. This code is similar in many ways to the function developed earlier (Fig. 22.3) to 
solve a single ODE with Euler’s method. For example, it is passed the function name defining 
the ODEs through its argument. 


FIGURE 22.8 
An M-file to implement the RK4 method for a system of ODEs. 


function [tp,yp] = rk4sys(dydt,tspan,y0,h,varargin)
% rk4sys: fourth-order Runge-Kutta for a system of ODEs
%   [t,y] = rk4sys(dydt,tspan,y0,h,p1,p2,...): integrates
%           a system of ODEs with fourth-order RK method
% input:
%   dydt = name of the M-file that evaluates the ODEs
%   tspan = [ti, tf]; initial and final times with output
%                     generated at interval of h, or
%         = [t0 t1 ... tf]; specific times where solution output
%   y0 = initial values of dependent variables
%   h = step size
%   p1,p2,... = additional parameters used by dydt
% output:
%   tp = vector of independent variable
%   yp = vector of solution for dependent variables

if nargin<4,error('at least 4 input arguments required'), end
if any(diff(tspan)<=0),error('tspan not ascending order'), end
n = length(tspan);
ti = tspan(1);tf = tspan(n);
if n == 2
  t = (ti:h:tf)'; n = length(t);
  if t(n)<tf
    t(n+1) = tf;
    n = n+1;
  end
else
  t = tspan;
end
tt = ti; y(1,:) = y0;
np = 1; tp(np) = tt; yp(np,:) = y(1,:);
i = 1;
while(1)
  tend = t(np+1);
  hh = t(np+1) - t(np);
  if hh>h,hh = h;end
  while(1)
    if tt+hh>tend,hh = tend-tt;end
    k1 = dydt(tt,y(i,:),varargin{:})';
    ymid = y(i,:) + k1.*hh./2;
    k2 = dydt(tt+hh/2,ymid,varargin{:})';
    ymid = y(i,:) + k2*hh/2;
    k3 = dydt(tt+hh/2,ymid,varargin{:})';
    yend = y(i,:) + k3*hh;
    k4 = dydt(tt+hh,yend,varargin{:})';
    phi = (k1+2*(k2+k3)+k4)/6;
    y(i+1,:) = y(i,:) + phi*hh;
    tt = tt+hh;
    i = i+1;
    if tt>=tend,break,end
  end
  np = np+1; tp(np) = tt; yp(np,:) = y(i,:);
  if tt>=tf,break,end
end




However, it has an additional feature that allows you to generate output in two ways,
depending on how the input variable tspan is specified. As was the case for Fig. 22.3, you
can set tspan = [ti tf], where ti and tf are the initial and final times, respectively. If done
in this way, the routine automatically generates output values between these limits at
equally spaced intervals h. Alternatively, if you want to obtain results at specific times, you
can define tspan = [t0,t1,...,tf]. Note that in both cases, the tspan values must be in
ascending order.

We can employ rk4sys to solve the same problem as in Example 22.5. First, we can 
develop an M-file to hold the ODEs: 


function dy = dydtsys(t, y) 
dy = [y(2);9.81-0.25/68.1*y(2)^2];


where y(1) = distance (x) and y(2) = velocity (v). The solution can then be generated as 


>> [t y] = rk4sys(@dydtsys,[0 10],[0 0],2); 
>> disp([t' y(:,1) y(:,2)]) 


         0         0         0
    2.0000   19.1656   18.7256
    4.0000   71.9311   33.0995
    6.0000  147.9521   42.0547
    8.0000  237.5104   46.9345
   10.0000  334.1626   49.4027

We can also use tspan to generate results at specific values of the independent variable.
For example, 


>> tspan=[0 6 10]; 
>> [t y] = rk4sys(@dydtsys,tspan,[0 0],2); 
>> disp([t' y(:,1) y(:,2)]) 


0 0 0 
6.0000 147.9521 42.0547 
10.0000 334.1626 49.4027 


22.6 CASE STUDY PREDATOR-PREY MODELS AND CHAOS


Background. Engineers and scientists deal with a variety of problems involving systems 
of nonlinear ordinary differential equations. This case study focuses on two of these ap- 
plications. The first relates to predator-prey models that are used to study species interac- 
tions. The second are equations derived from fluid dynamics that are used to simulate the 
atmosphere. 

Predator-prey models were developed independently in the early part of the twenti- 
eth century by the Italian mathematician Vito Volterra and the American biologist Alfred 
Lotka. These equations are commonly called Lotka-Volterra equations. The simplest
version is the following pair of ODEs:


$$\frac{dx}{dt} = ax - bxy \tag{22.49}$$
$$\frac{dy}{dt} = -cy + dxy \tag{22.50}$$


where x and y = the number of prey and predators, respectively, a = the prey growth rate, 
c = the predator death rate, and b and d = the rates characterizing the effect of the predator- 
prey interactions on the prey death and the predator growth, respectively. The multiplica- 
tive terms (i.e., those involving xy) are what make such equations nonlinear. 

An example of a simple nonlinear model based on atmospheric fluid dynamics is the
Lorenz equations created by the American meteorologist Edward Lorenz:


$$\frac{dx}{dt} = -\sigma x + \sigma y$$
$$\frac{dy}{dt} = rx - y - xz$$
$$\frac{dz}{dt} = -bz + xy$$


Lorenz developed these equations to relate the intensity of atmospheric fluid motion x
to temperature variations y and z in the horizontal and vertical directions, respectively.
As with the predator-prey model, the nonlinearities stem from the simple multiplicative
terms: xz and xy.

Use numerical methods to obtain solutions for these equations. Plot the results to 
visualize how the dependent variables change temporally. In addition, graph the dependent 
variables versus each other to see whether any interesting patterns emerge. 


Solution. The following parameter values can be used for the predator-prey simulation:
a = 1.2, b = 0.6, c = 0.8, and d = 0.3. Employ initial conditions of x = 2 and y = 1 and
integrate from t = 0 to 30, using a step size of h = 0.0625.

First, we can develop a function to hold the differential equations: 


function yp = predprey(t,y,a,b,c,d) 
yp = [a*y(1)-b*y(1)*y(2); -c*y(2)+d*y(1)*y(2)]; 


The following script employs this function to generate solutions with both the Euler
and the fourth-order RK methods. Note that the function eulersys was based on
modifying the rk4sys function (Fig. 22.8). We will leave the development of such an M-file as
a homework problem. In addition to displaying the solution as a time-series plot (x and y
versus t), the script also generates a plot of y versus x. Such phase-plane plots are often
useful in elucidating features of the model’s underlying structure that may not be evident
from the time series.


h=0.0625;tspan=[0 40];y0=[2 1];
a=1.2;b=0.6;c=0.8;d=0.3;
[t y] = eulersys(@predprey,tspan,y0,h,a,b,c,d);
subplot(2,2,1);plot(t,y(:,1),t,y(:,2),'--')
legend('prey','predator');title('(a) Euler time plot')
subplot(2,2,2);plot(y(:,1),y(:,2))
title('(b) Euler phase plane plot')
[t y] = rk4sys(@predprey,tspan,y0,h,a,b,c,d);
subplot(2,2,3);plot(t,y(:,1),t,y(:,2),'--')
title('(c) RK4 time plot')
subplot(2,2,4);plot(y(:,1),y(:,2))
title('(d) RK4 phase plane plot')


The solution obtained with Euler’s method is shown at the top of Fig. 22.9. The time 
series (Fig. 22.9a) indicates that the amplitudes of the oscillations are expanding. This is 
reinforced by the phase-plane plot (Fig. 22.9b). Hence, these results indicate that the crude 
Euler method would require a much smaller time step to obtain accurate results. 

In contrast, because of its much smaller truncation error, the RK4 method yields 
good results with the same time step. As in Fig. 22.9c, a cyclical pattern emerges in time. 
Because the predator population is initially small, the prey grows exponentially. At a 
certain point, the prey become so numerous that the predator population begins to grow. 
Eventually, the increased predators cause the prey to decline. This decrease, in turn, leads 
to a decrease of the predators. Eventually, the process repeats. Notice that, as expected, the 
predator peak lags the prey. Also, observe that the process has a fixed period—that is, it 
repeats in a set time.


FIGURE 22.9 
Solution for the Lotka-Volterra model. Euler’s method (a) time-series and (b) phase-plane 
plots, and RK4 method (c) time-series and (d) phase-plane plots. 


The phase-plane representation for the accurate RK4 solution (Fig. 22.9d) indi- 
cates that the interaction between the predator and the prey amounts to a closed coun- 
terclockwise orbit. Interestingly, there is a resting or critical point at the center of the 
orbit. The exact location of this point can be determined by setting Eqs. (22.49) and 
(22.50) to steady state (dy/dt = dx/dt = 0) and solving for (x, y) = (0, 0) and (c/d, a/b). 
The former is the trivial result that if we start with neither predators nor prey, nothing 
will happen. The latter is the more interesting outcome that if the initial conditions are 
set at x = c/d and y = a/b, the derivatives will be zero, and the populations will remain 
constant. 
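The two steady states can be checked numerically by evaluating the right-hand sides of Eqs. (22.49) and (22.50) at each point. This is a small sketch using the parameter values from this case study. The anonymous function rhs is introduced here for illustration and mirrors the predprey M-file, with the parameters captured from the workspace rather than passed as arguments.

```matlab
% Verify that (0, 0) and (c/d, a/b) are steady states of the Lotka-Volterra model
a = 1.2; b = 0.6; c = 0.8; d = 0.3;
rhs = @(t,y) [a*y(1) - b*y(1)*y(2); -c*y(2) + d*y(1)*y(2)];

ytriv = [0; 0];          % trivial steady state: no prey, no predators
ycrit = [c/d; a/b];      % nontrivial critical point at the center of the orbit

disp(rhs(0, ytriv)')     % both derivatives are exactly zero
disp(rhs(0, ycrit)')     % both derivatives are zero to within roundoff
fprintf('critical point: x = %.4f, y = %.4f\n', ycrit(1), ycrit(2));
```

For the values used here the critical point is at roughly x = 2.67 and y = 2, which lies inside the closed orbit traced out from the initial condition (2, 1).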

Now, let’s use the same approach to investigate the trajectories of the Lorenz equations
with the following parameter values: σ = 10, b = 8/3, and r = 28. Employ initial conditions
of x = y = z = 5 and integrate from t = 0 to 20. For this case, we will use the fourth-order
RK method to obtain solutions with a constant time step of h = 0.03125.

FIGURE 22.10 

Time-domain representation of x versus t for the Lorenz equations. The solid time series is 
for the initial conditions (5, 5, 5). The dashed line is where the initial condition for x is 
perturbed slightly (5.001, 5, 5). 


The results are quite different from the behavior of the Lotka-Volterra equations. As 
in Fig. 22.10, the variable x seems to be undergoing an almost random pattern of oscilla- 
tions, bouncing around from negative values to positive values. The other variables exhibit 
similar behavior. However, even though the patterns seem random, the frequency of the 
oscillation and the amplitudes seem fairly consistent. 

An interesting feature of such solutions can be illustrated by changing the initial 
condition for x slightly (from 5 to 5.001). The results are superimposed as the dashed 
line in Fig. 22.10. Although the solutions track on each other for a time, after about 
t = 15 they diverge significantly. Thus, we can see that the Lorenz equations are quite 
sensitive to their initial conditions. The term chaotic is used to describe such solutions. 
In his original study, this led Lorenz to the conclusion that long-range weather forecasts 
might be impossible! 
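The sensitivity experiment described above can be sketched in a self-contained script. This is an illustration under stated assumptions rather than the code used to produce the figure: it hard-codes a fixed-step RK4 loop instead of calling rk4sys, and the names lorenz, y0s, and X are chosen here. The state vector is ordered [x y z].

```matlab
% Lorenz equations: compare x(t) for two nearly identical initial conditions
sigma = 10; b = 8/3; r = 28;
lorenz = @(t,y) [-sigma*y(1) + sigma*y(2), ...
                 r*y(1) - y(2) - y(1)*y(3), ...
                 -b*y(3) + y(1)*y(2)];

h = 0.03125; t = (0:h:20)'; n = length(t);
y0s = [5 5 5; 5.001 5 5];        % base case and slightly perturbed case
X = zeros(n,2);                  % column j stores x(t) for run j

for run = 1:2
  y = y0s(run,:);
  X(1,run) = y(1);
  for i = 1:n-1                  % classical RK4 with constant step h
    k1 = lorenz(t(i),       y);
    k2 = lorenz(t(i) + h/2, y + k1*h/2);
    k3 = lorenz(t(i) + h/2, y + k2*h/2);
    k4 = lorenz(t(i) + h,   y + k3*h);
    y = y + (k1 + 2*(k2 + k3) + k4)*h/6;
    X(i+1,run) = y(1);
  end
end

dx = abs(X(:,1) - X(:,2));       % separation between the two x histories
fprintf('max separation for t <= 1 : %.4f\n', max(dx(t <= 1)));
fprintf('max separation for t >= 15: %.4f\n', max(dx(t >= 15)));
```

Adding plot(t,X(:,1),'-',t,X(:,2),'--') after the loops would overlay the two time series as a solid and a dashed line. The printed separations should be tiny early in the run and comparable to the size of the oscillations late in the run.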

The sensitivity of a dynamical system to small perturbations of its initial conditions
is sometimes called the butterfly effect. The idea is that the flapping of a butterfly’s wings
might induce tiny changes in the atmosphere that ultimately lead to a large-scale weather
phenomenon like a tornado.


FIGURE 22.11 


Phase-plane representation for the Lorenz equations. (a) xy, (b) xz, and (c) yz projections. 


Although the time-series plots are chaotic, phase-plane plots reveal an underlying
structure. Because we are dealing with three dependent variables, we can generate
projections. Figure 22.11 shows projections in the xy, xz, and the yz planes. Notice how a
structure is manifest when perceived from the phase-plane perspective. The solution forms
orbits around what appear to be critical points. These points are called strange attractors in
the jargon of mathematicians who study such nonlinear systems.

Beyond the two-variable projections, MATLAB’s plot3 function provides a vehicle to 
directly generate a three-dimensional phase-plane plot: 


>> plot3(y(:,1),y(:,2),y(:,3)) 
>> xlabel('x');ylabel('y');zlabel('z');grid 


As was the case for Fig. 22.11, the three-dimensional plot (Fig. 22.12) depicts trajectories
cycling in a definite pattern around a pair of critical points. 

As a final note, the sensitivity of chaotic systems to initial conditions has implications
for numerical computations. Beyond the initial conditions themselves, different step sizes
or different algorithms (and in some cases, even different computers) can introduce small
differences in the solutions. In a similar fashion to Fig. 22.10, these discrepancies will
eventually lead to large deviations. Some of the problems in this chapter and in Chap. 23
are designed to demonstrate this issue.


FIGURE 22.12
Three-dimensional phase-plane representation for the Lorenz equations generated with
MATLAB’s plot3 function.


PROBLEMS 


22.1 Solve the following initial value problem over the
interval from t = 0 to where y(0) = 1. Display all your results
on the same graph.


(a) Analytically. 

(b) Using Euler’s method with h = 0.5 and 0.25. 

(c) Using the midpoint method with h = 0.5. 

(d) Using the fourth-order RK method with h = 0.5. 


22.2 Solve the following problem over the interval from 
x = 0 to 1 using a step size of 0.25 where y(0) = 1. Display 
all your results on the same graph. 

$$\frac{dy}{dx} = (1 + 2x)\sqrt{y}$$


(a) Analytically. 

(b) Using Euler’s method. 

(c) Using Heun’s method without iteration. 
(d) Using Ralston’s method. 

(e) Using the fourth-order RK method.

22.3 Solve the following problem over the interval from
t = 0 to 3 using a step size of 0.5 where y(0) = 1. Display all
your results on the same graph.

$$\frac{dy}{dt} = -y + t^2$$

Obtain your solutions with (a) Heun’s method without
iterating the corrector, (b) Heun’s method with iterating
the corrector until $\varepsilon_s < 0.1\%$, (c) the midpoint method, and
(d) Ralston’s method.

22.4 The growth of populations of organisms has many 
engineering and scientific applications. One of the simplest 
models assumes that the rate of change of the population p is 
proportional to the existing population at any time t: 


$$\frac{dp}{dt} = k_g p \tag{P22.4.1}$$


where $k_g$ = the growth rate. The world population in millions
from 1950 through 2000 was


t   1950   1955   1960   1965   1970   1975
p   2555   2780   3040   3346   3708   4087
t   1980   1985   1990   1995   2000
p   4454   4850   5276   5686   6079


(a) Assuming that Eq. (P22.4.1) holds, use the data from
1950 through 1970 to estimate $k_g$.

(b) Use the fourth-order RK method along with the results
of (a) to simulate the world population from 1950 to
2050 with a step size of 5 years. Display your simulation
results along with the data on a plot.

22.5 Although the model in Prob. 22.4 works adequately
when population growth is unlimited, it breaks down when
factors such as food shortages, pollution, and lack of space
inhibit growth. In such cases, the growth rate is not a
constant, but can be formulated as


$$k_g = k_{gm}(1 - p/p_{max})$$


where $k_{gm}$ = the maximum growth rate under unlimited
conditions, p = population, and $p_{max}$ = the maximum population.
Note that $p_{max}$ is sometimes called the carrying capacity.
Thus, at low population density $p \ll p_{max}$, $k_g \to k_{gm}$. As p
approaches $p_{max}$, the growth rate approaches zero. Using this
growth rate formulation, the rate of change of population
can be modeled as


$$\frac{dp}{dt} = k_{gm}(1 - p/p_{max})\,p$$


This is referred to as the logistic model. The analytical
solution to this model is


$$p = p_0 \frac{p_{max}}{p_0 + (p_{max} - p_0)e^{-k_{gm}t}}$$

Simulate the world’s population from 1950 to 2050 using
(a) the analytical solution and (b) the fourth-order RK
method with a step size of 5 years. Employ the following
initial conditions and parameter values for your simulation:
$p_0$ (in 1950) = 2555 million people, $k_{gm}$ = 0.026/yr, and
$p_{max}$ = 12,000 million people. Display your results as a plot
along with the data from Prob. 22.4.

22.6 Suppose that a projectile is launched upward from the
earth’s surface. Assume that the only force acting on the
object is the downward force of gravity. Under these
conditions, a force balance can be used to derive


$$\frac{dv}{dt} = -g(0)\frac{R^2}{(R + x)^2}$$


where v = upward velocity (m/s), t = time (s), x =
altitude (m) measured upward from the earth’s surface,
g(0) = the gravitational acceleration at the earth’s
surface (≅ 9.81 m/s²), and R = the earth’s radius (≅ 6.37 ×
10⁶ m). Recognizing that dx/dt = v, use Euler’s method to
determine the maximum height that would be obtained if
v(t = 0) = 1500 m/s.

22.7 Solve the following pair of ODEs over the interval 
from t = 0 to 0.4 using a step size of 0.1. The initial condi- 
tions are y(0) = 2 and z(0) = 4. Obtain your solution with 
(a) Euler’s method and (b) the fourth-order RK method. Dis- 
play your results as a plot. 


$$\frac{dy}{dt} = -2y + 4e^{-t}$$
$$\frac{dz}{dt} = -\frac{yz^2}{3}$$


22.8 The van der Pol equation is a model of an electronic
circuit that arose back in the days of vacuum tubes:


$$\frac{d^2y}{dt^2} - (1 - y^2)\frac{dy}{dt} + y = 0$$


Given the initial conditions, y(0) = y′(0) = 1, solve this
equation from t = 0 to 10 using Euler’s method with a step size
of (a) 0.2 and (b) 0.1. Plot both solutions on the same graph.

22.9 Given the initial conditions, y(0) = 1 and y′(0) = 0,
solve the following initial-value problem from t = 0 to 4:

$$\frac{d^2y}{dt^2} + 9y = 0$$

Obtain your solutions with (a) Euler’s method and (b) the
fourth-order RK method. In both cases, use a step size of 0.1.
Plot both solutions on the same graph along with the exact
solution y = cos 3t.

22.10 Develop an M-file to solve a single ODE with Heun’s 
method with iteration. Design the M-file so that it creates a 
plot of the results. Test your program by using it to solve for 
population as described in Prob. 22.5. Employ a step size of 
5 years and iterate the corrector until $\varepsilon_s < 0.1\%$.

22.11 Develop an M-file to solve a single ODE with the 
midpoint method. Design the M-file so that it creates a plot 
of the results. Test your program by using it to solve for 
population as described in Prob. 22.5. Employ a step size of 
5 years. 

22.12 Develop an M-file to solve a single ODE with the 
fourth-order RK method. Design the M-file so that it creates 
a plot of the results. Test your program by using it to solve 
Prob. 22.2. Employ a step size of 0.1. 

22.13 Develop an M-file to solve a system of ODEs with 
Euler’s method. Design the M-file so that it creates a plot of 
the results. Test your program by using it to solve Prob. 22.7 
with a step size of 0.25. 

22.14 Isle Royale National Park is a 210-square-mile
archipelago composed of a single large island and many small
islands in Lake Superior. Moose arrived around 1900, and
by 1930, their population approached 3000, ravaging
vegetation. In 1949, wolves crossed an ice bridge from Ontario.
Since the late 1950s, the numbers of the moose and wolves
have been tracked.


Year  Moose  Wolves    Year  Moose  Wolves    Year  Moose  Wolves
1959    563    20      1975   1355    41      1991   1313     2
1960    610    22      1976   1282    44      1992   1590     2
1961    628    22      1977   1143    34      1993   1879     3
1962    639    23      1978   1001    40      1994   1770    17
1963    663    20      1979   1028    43      1995   2422    16
1964    707    26      1980    910    50      1996   1163    22
1965    733    28      1981    863    30      1997    500    24
1966    765    26      1982    872    14      1998    699     4
1967    912    22      1983    932    23      1999    750    25
1968   1042    22      1984   1038    24      2000    850    29
1969   1268    17      1985   1115    22      2001    900    19
1970   1295    18      1986   1192    20      2002   1100    17
1971   1439    20      1987   1268    16      2003    900     9
1972   1493    23      1988   1335    12      2004    750    29
1973   1435    24      1989   1397    12      2005    540    30
1974   1467    31      1990   1216    15      2006    450    30


(a) Integrate the Lotka-Volterra equations (Sec. 22.6) from
1960 through 2020 using the following coefficient
values: a = 0.23, b = 0.0133, c = 0.4, and d = 0.0004.
Compare your simulation with the data using a time-series
plot and determine the sum of the squares of the
residuals between your model and the data for both the moose
and the wolves.

(b) Develop a phase-plane plot of your solution.

22.15 The motion of a damped spring-mass system
(Fig. P22.15) is described by the following ordinary
differential equation:

$$m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = 0$$

where x = displacement from equilibrium position (m),
t = time (s), m = 20-kg mass, and c = the damping
coefficient (N · s/m). The damping coefficient c takes on three
values of 5 (underdamped), 40 (critically damped), and
200 (overdamped). The spring constant k = 20 N/m. The
initial velocity is zero, and the initial displacement x = 1 m.
Solve this equation using a numerical method over the
time period 0 ≤ t ≤ 15 s. Plot the displacement versus time
for each of the three values of the damping coefficient on
the same plot.


FIGURE P22.15


FIGURE P22.16
A spherical tank.

22.16 A spherical tank has a circular orifice in its bottom 
through which the liquid flows out (Fig. P22.16). The flow 
rate through the hole can be estimated as 


$$Q_{out} = CA\sqrt{2gh}$$


where $Q_{out}$ = outflow (m³/s), C = an empirically derived
coefficient, A = the area of the orifice (m²), g = the
gravitational constant (= 9.81 m/s²), and h = the depth of liquid
in the tank. Use one of the numerical methods described in
this chapter to determine how long it will take for the water
to flow out of a 3-m diameter tank with an initial height of
2.75 m. Note that the orifice has a diameter of 3 cm and
C = 0.55.

22.17 In the investigation of a homicide or accidental death, 
it is often important to estimate the time of death. From the 
experimental observations, it is known that the surface tem- 
perature of an object changes at a rate proportional to the 
difference between the temperature of the object and that of 
the surrounding environment or ambient temperature. This 
is known as Newton’s law of cooling. Thus, if T(t) is the
temperature of the object at time t, and $T_a$ is the constant
ambient temperature:


$$\frac{dT}{dt} = -K(T - T_a)$$


where K > 0 is a constant of proportionality. Suppose that
at time t = 0 a corpse is discovered and its temperature is
measured to be $T_o$. We assume that at the time of death,
the body temperature $T_d$ was at the normal value of 37 °C.
Suppose that the temperature of the corpse when it was dis- 
covered was 29.5 °C, and that two hours later, it is 23.5 °C. 
The ambient temperature is 20 °C. 

(a) Determine K and the time of death. 

(b) Solve the ODE numerically and plot the results. 

22.18 The reaction A → B takes place in two reactors in
series. The reactors are well mixed but are not at steady state.
The unsteady-state mass balance for each stirred tank reactor
is shown below:


$$\frac{dCA_1}{dt} = \frac{1}{\tau}(CA_0 - CA_1) - kCA_1$$
$$\frac{dCB_1}{dt} = -\frac{1}{\tau}CB_1 + kCA_1$$
$$\frac{dCA_2}{dt} = \frac{1}{\tau}(CA_1 - CA_2) - kCA_2$$
$$\frac{dCB_2}{dt} = \frac{1}{\tau}(CB_1 - CB_2) + kCA_2$$

where $CA_0$ = concentration of A at the inlet of the first
reactor, $CA_1$ = concentration of A at the outlet of the first
reactor (and inlet of the second), $CA_2$ = concentration of A at
the outlet of the second reactor, $CB_1$ = concentration of B at
the outlet of the first reactor (and inlet of the second), $CB_2$ =
concentration of B in the second reactor, τ = residence time
for each reactor, and k = the rate constant for reaction of A
to produce B. If $CA_0$ is equal to 20, find the concentrations
of A and B in both reactors during their first 10 minutes of
operation. Use k = 0.12/min and τ = 5 min and assume that
the initial conditions of all the dependent variables are zero.
22.19 A nonisothermal batch reactor can be described by 
the following equations: 


$$\frac{dC}{dt} = -e^{-10/(T+273)}C$$

$$\frac{dT}{dt} = 1000e^{-10/(T+273)}C - 10(T - 20)$$


where C is the concentration of the reactant and T is the tem- 
perature of the reactor. Initially, the reactor is at 15 °C and 
has a concentration of reactant C of 1.0 gmol/L. Find the 
concentration and temperature of the reactor as a function 
of time. 

FIGURE P22.21


22.20 The following equation can be used to model the de- 
flection of a sailboat mast subject to a wind force: 

$$\frac{d^2y}{dz^2} = \frac{f(z)}{2EI}(L - z)^2$$

where f(z) = wind force, E = modulus of elasticity, L = mast
length, and I = moment of inertia. Note that the force varies
with height according to

$$f(z) = \frac{200z}{5 + z}e^{-2z/30}$$

Calculate the deflection if y = 0 and dy/dz = 0 at z = 0. Use
parameter values of L = 30, E = $1.25 \times 10^8$, and I = 0.05 for
your computation.

22.21 A pond drains through a pipe as shown in Fig. P22.21. 
Under a number of simplifying assumptions, the following 
differential equation describes how depth changes with time: 


$$\frac{dh}{dt} = -\frac{\pi d^2}{4A(h)}\sqrt{2g(h + e)}$$

where h = depth (m), t = time (s), d = pipe diameter (m),
A(h) = pond surface area as a function of depth (m²), g =
gravitational constant (= 9.81 m/s²), and e = depth of pipe
outlet below the pond bottom (m). Based on the following
area-depth table, solve this differential equation to determine
how long it takes for the pond to empty, given that
h(0) = 6 m, d = 0.25 m, e = 1 m.


h, m             6     5     4     3     2     1     0
A(h), 10^4 m²    1.17  0.97  0.67  0.45  0.32  0.18  0


22.22 Engineers and scientists use mass-spring models to 
gain insight into the dynamics of structures under the influ- 
ence of disturbances such as earthquakes. Figure P22.22 
shows such a representation for a three-story building. For 
this case, the analysis is limited to horizontal motion of the 


FIGURE P22.22
Three-story building represented as a mass-spring system. From top to bottom:
m3 = 8000 kg, k3 = 1800 kN/m; m2 = 10,000 kg, k2 = 2400 kN/m;
m1 = 12,000 kg, k1 = 3000 kN/m.


structure. Using Newton’s second law, force balances can be 
developed for this system as 


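$$\frac{d^2x_1}{dt^2} = -\frac{k_1}{m_1}x_1 + \frac{k_2}{m_1}(x_2 - x_1)$$
$$\frac{d^2x_2}{dt^2} = \frac{k_2}{m_2}(x_1 - x_2) + \frac{k_3}{m_2}(x_3 - x_2)$$
$$\frac{d^2x_3}{dt^2} = \frac{k_3}{m_3}(x_2 - x_3)$$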


Simulate the dynamics of this structure from t = 0 to 20 s,
given the initial condition that the velocity of the ground
floor is $dx_1/dt$ = 1 m/s, and all other initial values of displacements
and velocities are zero. Present your results as
two time-series plots of (a) displacements and (b) velocities. 
In addition, develop a three-dimensional phase-plane plot of 
the displacements. 

22.23 Repeat the same simulations as in Sec. 22.6 for the 
Lorenz equations but generate the solutions with the mid- 
point method. 

22.24 Perform the same simulations as in Sec. 22.6 for the
Lorenz equations but use a value of r = 99.96. Compare your 
results with those obtained in Sec. 22.6. 

22.25 Figure P22.25 shows the kinetic interactions govern- 
ing the concentrations of a bacteria culture and their nutri- 
tion source (substrate) in a continuously stirred flow-through 
bioreactor. 


FIGURE P22.25
Continuously stirred flow-through bioreactor to grow a
bacterial culture.


The mass balances for the bacteria biomass, X (gC/m³), and
the substrate concentration, S (gC/m³), can be written as

$$\frac{dX}{dt} = \left(k_{g,\max}\frac{S}{K_s + S} - k_d - k_r - \frac{1}{\tau_w}\right)X$$

$$\frac{dS}{dt} = -\frac{1}{Y}k_{g,\max}\frac{S}{K_s + S}X + k_dX + \frac{1}{\tau_w}(S_{in} - S)$$

where t = time (h), $k_{g,\max}$ = maximum bacterial growth
rate (/d), $K_s$ = half-saturation constant (gC/m³), $k_d$ = death
rate (/d), $k_r$ = respiration rate (/h), Q = flow rate (m³/h),
V = reactor volume (m³), Y = yield coefficient (gC-cell/gC-substrate),
and $S_{in}$ = inflow substrate concentration (mgC/m³).
Simulate how the substrate, bacteria, and total organic
carbon (X + S) change over time in this reactor for three
residence times: (a) $\tau_w$ = 20 h, (b) $\tau_w$ = 10 h, and (c) $\tau_w$ = 5 h.
Employ the following parameters for the simulation: X(0) =
100 gC/m³, S(0) = 0, $k_{g,\max}$ = 9.2/hr, $K_s$ = 150 gC/m³, $k_d$ =
$k_r$ = 0.01/hr, Y = 0.5 gC-cell/gC-substrate, V = 0.01 m³, and
$S_{in}$ = 1000 gC/m³, and display your results graphically.


Adaptive Methods 
and Stiff Systems 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to more advanced methods 
for solving initial-value problems for ordinary differential equations. Specific 
objectives and topics covered are 


• Understanding how the Runge-Kutta Fehlberg methods use RK methods of
  different orders to provide error estimates that are used to adjust the step size.
• Familiarizing yourself with the built-in MATLAB functions for solving ODEs.
• Learning how to adjust the options for MATLAB’s ODE solvers.
• Learning how to pass parameters to MATLAB’s ODE solvers.
• Understanding the difference between one-step and multistep methods for solving
  ODEs.
• Understanding what is meant by stiffness and its implications for solving ODEs.


23.1 ADAPTIVE RUNGE-KUTTA METHODS


To this point, we have presented methods for solving ODEs that employ a constant step size. 
For a significant number of problems, this can represent a serious limitation. For example, 
suppose that we are integrating an ODE with a solution of the type depicted in Fig. 23.1. 
For most of the range, the solution changes gradually. Such behavior suggests that a fairly 
large step size could be employed to obtain adequate results. However, for a localized 
region from t = 1.75 to 2.25, the solution undergoes an abrupt change. The practical consequence
of dealing with such functions is that a very small step size would be required to
accurately capture the impulsive behavior. If a constant step-size algorithm were employed, 
the smaller step size required for the region of abrupt change would have to be applied to 
the entire computation. As a consequence, a much smaller step size than necessary—and, 
therefore, many more calculations—would be wasted on the regions of gradual change. 
Algorithms that automatically adjust the step size can avoid such overkill and hence be 
of great advantage. Because they “adapt” to the solution’s trajectory, they are said to have
adaptive step-size control. Implementation of such approaches requires that an estimate of
the local truncation error be obtained at each step. This error estimate can then serve as a
basis for either shortening or lengthening the step size.

FIGURE 23.1
An example of a solution of an ODE that exhibits an abrupt change. Automatic step-size
adjustment has great advantages for such cases.

Before proceeding, we should mention that aside from solving ODEs, the methods 
described in this chapter can also be used to evaluate definite integrals. The evaluation of 
the definite integral 

$$I = \int_a^b f(x)\,dx$$


is equivalent to solving the differential equation 


$$\frac{dy}{dx} = f(x)$$

for y(b) given the initial condition y(a) = 0. Thus, the following techniques can be em- 
ployed to efficiently evaluate definite integrals involving functions that are generally 
smooth but exhibit regions of abrupt change. 
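
As a rough illustration of this equivalence, the following sketch evaluates a definite integral by handing the integrand to ode45 as the right-hand side of dy/dx = f(x). The integrand (a narrow pulse centered at x = 2), the limits, and the tightened tolerances are assumptions chosen only for demonstration; the built-in integral function is included merely as a cross-check.

```matlab
% Sketch: evaluate I = integral of f(x) from a to b by solving
% dy/dx = f(x) with y(a) = 0, so that I = y(b).
% The integrand, limits, and tolerances are illustrative assumptions.
f = @(x) 10*exp(-(x-2).^2/(2*0.075^2));    % smooth except for a sharp pulse near x = 2
a = 0; b = 4;
opts = odeset('RelTol',1e-6,'AbsTol',1e-9);
[x,y] = ode45(@(x,y) f(x),[a b],0,opts);   % the right-hand side does not depend on y
I_ode = y(end);                             % y(b) approximates the integral
I_quad = integral(f,a,b);                   % built-in quadrature for comparison
fprintf('ode45: %.6f   integral: %.6f\n',I_ode,I_quad)
```

Plotting x against y would show the solver clustering its steps around the pulse, which is the behavior that makes adaptive methods attractive for integrands of this type.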

There are two primary approaches to incorporate adaptive step-size control into one- 
step methods. Step halving involves taking each step twice, once as a full step and then as 
two half steps. The difference in the two results represents an estimate of the local trunca- 
tion error. The step size can then be adjusted based on this error estimate. 

In the second approach, called embedded RK methods, the local truncation error is es- 
timated as the difference between two predictions using different-order RK methods. These 
are currently the methods of choice because they are more efficient than step halving. 
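
The step-halving idea can be sketched in a few lines. The example below is only an illustration: it uses the second-order Heun method without iteration as the underlying one-step method (an assumption made to keep the code short) and the test equation from Example 22.1. The correction delta/3 follows from Richardson extrapolation for a second-order method and is not taken from the text.

```matlab
% Sketch of step halving with a second-order RK method (Heun without iteration).
f = @(t,y) 4*exp(0.8*t) - 0.5*y;                         % test ODE from Example 22.1
heun = @(t,y,h) y + h/2*(f(t,y) + f(t+h,y+h*f(t,y)));    % one RK2 step
t = 0; y = 2; h = 1;
y1 = heun(t,y,h);                        % one full step
y2 = heun(t+h/2,heun(t,y,h/2),h/2);      % two half steps
delta = y2 - y1;                         % basis for the local error estimate
y2c = y2 + delta/3;                      % extrapolated value (second-order method)
yTrue = 4/1.3*(exp(0.8)-exp(-0.5)) + 2*exp(-0.5);   % analytical solution at t = 1
fprintf('full step  %.6f\nhalf steps %.6f\ncorrected  %.6f\ntrue       %.6f\n',...
    y1,y2,y2c,yTrue)
```

In an adaptive algorithm, the magnitude of delta would be compared with a tolerance to decide whether the step should be shortened or lengthened.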




The embedded methods were first developed by Fehlberg. Hence, they are sometimes 
referred to as RK-Fehlberg methods. At face value, the idea of using two predictions of dif- 
ferent order might seem too computationally expensive. For example, a fourth- and fifth- 
order prediction amounts to a total of 10 function evaluations per step [recall Eqs. (22.44) 
and (22.45)]. Fehlberg cleverly circumvented this problem by deriving a fifth-order RK 
method that employs most of the same function evaluations required for an accompanying 
fourth-order RK method. Thus, the approach yielded the error estimate on the basis of only 
six function evaluations! 


23.1.1 MATLAB Functions for Nonstiff Systems 


Since Fehlberg originally developed his approach, other even better approaches have been 
developed. Several of these are available as built-in functions in MATLAB. 


ode23. The ode23 function uses the BS23 algorithm (Bogacki and Shampine, 1989; 
Shampine, 1994), which simultaneously uses second- and third-order RK formulas to solve 
the ODE and make error estimates for step-size adjustment. The formulas to advance the 
solution are 


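$$y_{i+1} = y_i + \frac{1}{9}(2k_1 + 3k_2 + 4k_3)h \tag{23.1}$$

where

$$k_1 = f(t_i, y_i) \tag{23.1a}$$

$$k_2 = f\left(t_i + \frac{1}{2}h,\; y_i + \frac{1}{2}k_1h\right) \tag{23.1b}$$

$$k_3 = f\left(t_i + \frac{3}{4}h,\; y_i + \frac{3}{4}k_2h\right) \tag{23.1c}$$

The error is estimated as

$$E_{i+1} = \frac{1}{72}(-5k_1 + 6k_2 + 8k_3 - 9k_4)h \tag{23.2}$$

where

$$k_4 = f(t_{i+1}, y_{i+1}) \tag{23.2a}$$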


Note that although there appear to be four function evaluations, there are really only three
because after the first step, the $k_1$ for the present step will be the $k_4$ from the previous step.
Thus, the approach yields a prediction and error estimate based on three evaluations rather
than the five that would ordinarily result from using second- (two evaluations) and third-order
(three evaluations) RK formulas in tandem.

After each step, the error is checked to determine whether it is within a desired tolerance.
If it is, the value of $y_{i+1}$ is accepted, and $k_4$ becomes $k_1$ for the next step. If the error is
too large, the step is repeated with reduced step sizes until the estimated error satisfies

$$E \le \max(\text{RelTol} \times |y|,\ \text{AbsTol}) \tag{23.3}$$

where RelTol is the relative tolerance (default = $10^{-3}$) and AbsTol is the absolute tolerance
(default = $10^{-6}$). Observe that the criterion for the relative error uses a fraction rather than a
percent relative error as we have done on many occasions prior to this point.
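
To make Eqs. (23.1) through (23.3) concrete, the following sketch carries out a single step by hand. It is not MATLAB's ode23 implementation; the test equation (from Example 22.1), the trial step size, and the variable names are assumptions, and the step-size adjustment that would follow a rejected step is omitted.

```matlab
% Sketch of one step of Eqs. (23.1)-(23.3); not the actual ode23 code.
f = @(t,y) 4*exp(0.8*t) - 0.5*y;           % test ODE from Example 22.1
t = 0; y = 2; h = 0.1;                     % trial step size (assumed)
k1 = f(t,y);                               % Eq. (23.1a)
k2 = f(t + h/2,y + k1*h/2);                % Eq. (23.1b)
k3 = f(t + 3*h/4,y + 3*k2*h/4);            % Eq. (23.1c)
yNew = y + (2*k1 + 3*k2 + 4*k3)*h/9;       % Eq. (23.1)
k4 = f(t + h,yNew);                        % Eq. (23.2a), reusable as next k1
E = (-5*k1 + 6*k2 + 8*k3 - 9*k4)*h/72;     % Eq. (23.2)
RelTol = 1e-3; AbsTol = 1e-6;              % default tolerances quoted in the text
accept = abs(E) <= max(RelTol*abs(yNew),AbsTol);   % Eq. (23.3)
fprintf('yNew = %.6f, E = %.3e, accept = %d\n',yNew,E,accept)
```

If the step were rejected, h would be reduced and the three new evaluations repeated; if accepted, k4 would simply be carried forward as k1.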


ode45. The ode45 function uses an algorithm developed by Dormand and Prince (1980), 
which simultaneously uses fourth- and fifth-order RK formulas to solve the ODE and make 
error estimates for step-size adjustment. MATLAB recommends that ode45 is the best func- 
tion to apply as a “first try” for most problems. 


ode113. The ode113 function uses a variable-order Adams-Bashforth-Moulton solver. It is 
useful for stringent error tolerances or computationally intensive ODE functions. Note that 
this is a multistep method as we will describe subsequently in Sec. 23.2. 

These functions can be called in a number of different ways. The simplest approach is 


[t, y] = ode45(odefun, tspan, y0) 


where y is the solution array where each column is one of the dependent variables and 
each row corresponds to a time in the column vector t, odefun is the name of the function 
returning a column vector of the right-hand sides of the differential equations, tspan specifies
the integration interval, and y0 is a vector containing the initial values.

Note that tspan can be formulated in two ways. First, if it is entered as a vector of two 
numbers, 


tspan = [ti tf]; 

the integration is performed from ti to tf. Second, to obtain solutions at specific times
t0, t1, ..., tn (all increasing or all decreasing), use

tspan = [t0 t1 ... tn]; 
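
The difference between the two forms can be seen with a brief sketch. The ODE is the one from Example 22.1; the choice of output times 0:1:4 is an assumption made for illustration.

```matlab
% Sketch: the two ways of specifying tspan.
dydt = @(t,y) 4*exp(0.8*t) - 0.5*y;
[t1,y1] = ode45(dydt,[0 4],2);     % solver chooses the output times
[t2,y2] = ode45(dydt,0:1:4,2);     % output only at t = 0, 1, 2, 3, 4
fprintf('two-element tspan: %d output points\n',numel(t1))
fprintf('five-element tspan: %d output points\n',numel(t2))
disp([t2 y2])
```

Specifying output times affects only where results are reported; the solver still selects its own internal steps.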

Here is an example of how ode45 can be used to solve a single ODE, $y' = 4e^{0.8t} - 0.5y$,
from t = 0 to 4 with an initial condition of y(0) = 2. Recall from Example 22.1 that the
analytical solution at t = 4 is 75.33896. Representing the ODE as an anonymous function,
ode45 can be used to generate the same result numerically as


>> dydt=@(t,y) 4*exp(0.8*t)-0.5*y; 
>> [t,y]=ode45(dydt,[0 4],2); 
>> y(length(t) ) 


ans = 
75.3390 


As described in the following example, the ODE is typically stored in its own M-file when 
dealing with systems of equations. 

EXAMPLE 23.1 Using MATLAB to Solve a System of ODEs


Problem Statement. Employ ode45 to solve the following set of nonlinear ODEs from 
t = 0 to 20: 


$$\frac{dy_1}{dt} = 1.2y_1 - 0.6y_1y_2 \qquad \frac{dy_2}{dt} = -0.8y_2 + 0.3y_1y_2$$

where $y_1$ = 2 and $y_2$ = 1 at t = 0. Such equations are referred to as predator-prey equations.

Solution. Before obtaining a solution with MATLAB, you must create a function to compute
the right-hand side of the ODEs. One way to do this is to create an M-file as in


function yp = predprey(t,y) 
yp = [1.2*y(1)-0.6*y(1)*y(2); -0.8*y(2)+0.3*y(1)*y(2)];


We stored this M-file under the name: predprey.m. 
Next, enter the following commands to specify the integration range and the initial 
conditions: 


>> tspan = [0 20]; 
>> y0 = [2, 1]; 


The solver can then be invoked by 
>> [t,y] = ode45(@predprey, tspan, y0); 


This command will then solve the differential equations in predprey.m over the range 
defined by tspan using the initial conditions found in y0. The results can be displayed by 


simply typing 
>> plot(t,y) 


which yields Fig. 23.2. 
In addition to a time series plot, it is also instructive to generate a phase-plane plot— 
that is, a plot of the dependent variables versus each other by 


>> plot(y(:,1),y(:,2)) 
which yields Fig. 23.3. 


FIGURE 23.2 
Solution of predator-prey model with MATLAB. 

FIGURE 23.3
Phase-plane plot of predator-prey model with MATLAB. 


As in the previous example, the MATLAB solver uses default parameters to control vari- 
ous aspects of the integration. In addition, there is also no control over the differential equa- 
tions’ parameters. To have control over these features, additional arguments are included as in 

[t, y] = ode45(odefun, tspan, y0, options, p1, p2,...) 


where options is a data structure that is created with the odeset function to control features 
of the solution, and p1, p2,... are parameters that you want to pass into odefun. 
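
An alternative to appending p1, p2,... to the solver call is to capture the parameters in an anonymous function. The sketch below applies this to the predator-prey model; the parameter names a, b, c, and d and the tighter tolerance are assumptions introduced for illustration.

```matlab
% Sketch: passing parameters through an anonymous function.
% The names a, b, c, d are hypothetical labels for the model coefficients.
a = 1.2; b = 0.6; c = 0.8; d = 0.3;
predprey = @(t,y,a,b,c,d) [a*y(1) - b*y(1)*y(2); -c*y(2) + d*y(1)*y(2)];
options = odeset('RelTol',1e-6);
% The wrapper @(t,y) ... fixes a, b, c, d at their current values.
[t,y] = ode45(@(t,y) predprey(t,y,a,b,c,d),[0 20],[2 1],options);
plot(t,y), legend('prey','predator')
```

With this form the solver sees a function of t and y only, so the same pattern works with any of the ODE solvers.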

The odeset function has the general syntax 

options = odeset('par1',val1,'par2',val2,...)

where the parameter par_i has the value val_i. A complete listing of all the possible parameters
can be obtained by merely entering odeset at the command prompt. Some commonly
used parameters are


'RelTol'       Allows you to adjust the relative tolerance.

'AbsTol'       Allows you to adjust the absolute tolerance.

'InitialStep'  The solver automatically determines the initial step. This option allows
               you to set your own.

'MaxStep'      The maximum step defaults to one-tenth of the tspan interval. This option
               allows you to override this default.


EXAMPLE 23.2 Using odeset to Control Integration Options


Problem Statement. Use ode23 to solve the following ODE from t = 0 to 4: 


$$\frac{dy}{dt} = 10e^{-(t-2)^2/[2(0.075)^2]} - 0.6y$$

where y(0) = 0.5. Obtain solutions for the default ($10^{-3}$) and for a more stringent ($10^{-4}$)
relative error tolerance.

Solution. First, we will create an M-file to compute the right-hand side of the ODE:

function yp = dydt(t, y)
yp = 10*exp(-(t-2)*(t-2)/(2*.075^2))-0.6*y;

Then, we can implement the solver without setting the options. Hence the default value for
the relative error ($10^{-3}$) is automatically used:


>> ode23(@dydt, [0 4], 0.5); 


Note that we have not set the function equal to output variables [t, y]. When we imple- 
ment one of the ODE solvers in this way, MATLAB automatically creates a plot of the 
results displaying circles at the values it has computed. As in Fig. 23.4a, notice how ode23 
takes relatively large steps in the smooth regions of the solution whereas it takes smaller 
steps in the region of rapid change around t = 2. 

We can obtain a more accurate solution by using the odeset function to set the relative
error tolerance to $10^{-4}$:


>> options=odeset('RelTol',1e-4);
>> ode23(@dydt, [0, 4], 0.5, options);


As in Fig. 23.4b, the solver takes more small steps to attain the increased accuracy. 
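
The effect of the tolerance on the number of steps can also be checked numerically rather than by inspecting the plots. The following sketch repeats both runs with output arguments and counts the points returned; the anonymous-function form of the ODE and the variable names are assumptions.

```matlab
% Sketch: count the solution points produced by ode23 for two tolerances.
dydt = @(t,y) 10*exp(-(t-2)^2/(2*0.075^2)) - 0.6*y;
[t1,y1] = ode23(dydt,[0 4],0.5);             % default RelTol = 1e-3
opts = odeset('RelTol',1e-4);
[t2,y2] = ode23(dydt,[0 4],0.5,opts);        % more stringent tolerance
fprintf('RelTol = 1e-3: %d points\n',numel(t1))
fprintf('RelTol = 1e-4: %d points\n',numel(t2))
```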


FIGURE 23.4 
Solution of ODE with MATLAB. For (b), a smaller relative error tolerance is used and hence
many more steps are taken.


(a) RelTol = $10^{-3}$ (b) RelTol = $10^{-4}$


23.1.2 Events 


MATLAB’s ODE solvers are commonly implemented for a prespecified integration interval.
That is, they are often used to obtain a solution from an initial to a final value of the independent
variable. However, there are many problems where we do not know the final time.

A nice example relates to the free-falling bungee jumper that we have been using 
throughout this book. Suppose that the jump master inadvertently neglects to attach the 
cord to the jumper. The final time for this case, which corresponds to the jumper hitting 
the ground, is not a given. In fact, the objective of solving the ODEs would be to determine
when the jumper hits the ground.

MATLAB’s events option provides a means to solve such problems. It works by solv- 
ing differential equations until one of the dependent variables reaches zero. Of course, 
there may be cases where we would like to terminate the computation at a value other than 
zero. As described in the following paragraphs, such cases can be readily accommodated. 

We will use our bungee jumper problem to illustrate the approach. The system of 
ODEs can be formulated as 


$$\frac{dx}{dt} = v$$

$$\frac{dv}{dt} = g - \frac{c_d}{m}v|v|$$

where x = distance (m), t = time (s), v = velocity (m/s) where positive velocity is in the
downward direction, g = the acceleration of gravity (= 9.81 m/s²), $c_d$ = a second-order
drag coefficient (kg/m), and m = mass (kg). Note that in this formulation, distance and
velocity are both positive in the downward direction, and the ground level is defined as
zero distance. For the present example, we will assume that the jumper is initially located
200 m above the ground and the initial velocity is 20 m/s in the upward direction; that is,
x(0) = −200 and v(0) = −20.

The first step is to express the system of ODEs as an M-file function: 

function dydt=freefall(t,y,cd,m)
% y(1) = x and y(2) = v
grav=9.81;
dydt=[y(2);grav-cd/m*y(2)*abs(y(2))];

In order to implement the event, two other M-files need to be developed. These are 
(1) a function that defines the event and (2) a script that generates the solution. 

For our bungee jumper problem, the event function (which we have named endevent) 
can be written as 

function [detect,stopint,direction]=endevent(t,y,varargin)
% Locate the time when height passes through zero
% and stop integration.
detect=y(1);      % Detect height = 0
stopint=1;        % Stop the integration
direction=0;      % Direction does not matter

This function is passed the values of the independent (t) and dependent variables (y) 
along with the model parameters (varargin). It then computes and returns three variables. 
The first, detect, specifies that MATLAB should detect the event when the dependent vari- 
able y(1) equals zero—that is, when the height x = 0. The second, stopint, is set to 1. This 
instructs MATLAB to stop when the event occurs. The final variable, direction, is set to 
0 if all zeros are to be detected (this is the default), +1 if only the zeros where the event 
function increases are to be detected, and —1 if only the zeros where the event function 
decreases are to be detected. In our case, because the direction of the approach to zero is
unimportant, we set direction to zero.¹


¹ Note that, as mentioned previously, we might want to detect a nonzero event. For example, we might want to
detect when the jumper reached x = 5. To do this, we would merely set detect = y(1) - 5.

Finally, a script can be developed to generate the solution:


opts=odeset('events',@endevent);
y0=[-200 -20];
[t,y,te,ye]=ode45(@freefall,[0 inf],y0,opts,0.25,68.1);
te,ye
plot(t,-y(:,1),'-',t,y(:,2),'--','LineWidth',2)
legend('Height (m)','Velocity (m/s)')
xlabel('time (s)');
ylabel('x (m) and v (m/s)')


In the first line, the odeset function is used to invoke the events option and specify that 
the event we are seeking is defined in the endevent function. Next, we set the initial conditions
(y0). Observe that because we do not know when the jumper will hit the ground, we set the
upper limit of the integration interval, which is passed directly as [0 inf], to infinity.
The third line then employs the ode45 function to generate the actual solution. As in all of
MATLAB’s ODE solvers, the function returns the answers in the vectors t and y. In addition, 
when the events option is invoked, ode45 can also return the time at which the event occurs 
(te), and the corresponding values of the dependent variables (ye). The remaining lines of the 
script merely display and plot the results. When the script is run, the output is displayed as 
te = 
9.5475 
ye = 
0.0000 46.2454 


The plot is shown in Fig. 23.5. Thus, the jumper hits the ground in 9.5475 s with a velocity 
of 46.2454 m/s. 
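
For quick experiments, the same computation can be sketched in a single script without separate M-files. This version is an assumption-laden alternative rather than the approach used above: it captures the parameters in an anonymous function and uses deal to return the three event outputs. The names groundHit, cd, and m are hypothetical.

```matlab
% Sketch: free-fall event detection using anonymous functions only.
cd = 0.25; m = 68.1; g = 9.81;
freefall = @(t,y) [y(2); g - cd/m*y(2)*abs(y(2))];   % y(1) = x, y(2) = v
% deal returns, in order: the event value, the stop flag, and the direction.
groundHit = @(t,y) deal(y(1),1,0);
opts = odeset('Events',groundHit);
[t,y,te,ye] = ode45(freefall,[0 inf],[-200; -20],opts);
fprintf('impact at t = %.4f s with v = %.4f m/s\n',te,ye(2))
```

To stop at a nonzero height such as x = 5, the first argument of deal would become y(1) - 5, as noted in the footnote.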


FIGURE 23.5 
MATLAB-generated plot of the height above the ground and velocity of the free-falling 
bungee jumper without the cord. 

23.2 MULTISTEP METHODS


The one-step methods described in the previous sections utilize information at a single
point $t_i$ to predict a value of the dependent variable $y_{i+1}$ at a future point $t_{i+1}$ (Fig. 23.6a).
Alternative approaches, called multistep methods (Fig. 23.6b), are based on the insight
that, once the computation has begun, valuable information from previous points is at our 
command. The curvature of the lines connecting these previous values provides informa- 
tion regarding the trajectory of the solution. Multistep methods exploit this information to 
solve ODEs. In this section, we will present a simple second-order method that serves to 
demonstrate the general characteristics of multistep approaches. 


23.2.1 The Non-Self-Starting Heun Method 

Recall that the Heun approach uses Euler’s method as a predictor [Eq. (22.15)]:

$$y^0_{i+1} = y_i + f(t_i, y_i)h \tag{23.4}$$

and the trapezoidal rule as a corrector [Eq. (22.17)]:

$$y_{i+1} = y_i + \frac{f(t_i, y_i) + f(t_{i+1}, y^0_{i+1})}{2}h \tag{23.5}$$


Thus, the predictor and the corrector have local truncation errors of $O(h^2)$ and $O(h^3)$,
respectively. This suggests that the predictor is the weak link in the method because it
has the greatest error. This weakness is significant because the efficiency of the iterative
corrector step depends on the accuracy of the initial prediction. Consequently, one way
to improve Heun’s method is to develop a predictor that has a local error of $O(h^3)$. This
can be accomplished by using Euler’s method and the slope at $y_i$, and extra information
from a previous point $y_{i-1}$, as in

$$y^0_{i+1} = y_{i-1} + f(t_i, y_i)2h \tag{23.6}$$

This formula attains $O(h^3)$ at the expense of employing a larger step size 2h. In addition,
note that the equation is not self-starting because it involves a previous value of
the dependent variable $y_{i-1}$. Such a value would not be available in a typical initial-value
problem. Because of this fact, Eqs. (23.5) and (23.6) are called the non-self-starting Heun
method. As depicted in Fig. 23.7, the derivative estimate in Eq. (23.6) is now located at
the midpoint rather than at the beginning of the interval over which the prediction is made.
This centering improves the local error of the predictor to $O(h^3)$.


FIGURE 23.6
Graphical depiction of the fundamental difference between (a) one-step and (b) multistep
methods for solving ODEs.


FIGURE 23.7 
A graphical depiction of the non-self-starting Heun method. (a) The midpoint method that is 
used as a predictor. (b) The trapezoidal rule that is employed as a corrector. 

The non-self-starting Heun method can be summarized as

Predictor (Fig. 23.7a):

$$y^0_{i+1} = y^m_{i-1} + f(t_i, y^m_i)2h \tag{23.7}$$

Corrector (Fig. 23.7b):

$$y^j_{i+1} = y^m_i + \frac{f(t_i, y^m_i) + f(t_{i+1}, y^{j-1}_{i+1})}{2}h \tag{23.8}$$

(for j = 1, 2, ..., m)

where the superscripts denote that the corrector is applied iteratively from j = 1 to m to
obtain refined solutions. Note that $y^m_i$ and $y^m_{i-1}$ are the final results of the corrector iterations
at the previous time steps. The iterations are terminated based on an estimate of the approximate
error,

$$|\varepsilon_a| = \left|\frac{y^j_{i+1} - y^{j-1}_{i+1}}{y^j_{i+1}}\right| \times 100\% \tag{23.9}$$

When $|\varepsilon_a|$ is less than a prespecified error tolerance $\varepsilon_s$, the iterations are terminated. At this
point, j = m. The use of Eqs. (23.7) through (23.9) to solve an ODE is demonstrated in the
following example.


EXAMPLE 23.3 Non-Self-Starting Heun’s Method


Problem Statement. Use the non-self-starting Heun method to perform the same computations
as were performed previously in Example 22.2 using Heun’s method. That is,
integrate $y' = 4e^{0.8t} - 0.5y$ from t = 0 to 4 with a step size of 1. As with Example 22.2, the
initial condition at t = 0 is y = 2. However, because we are now dealing with a multistep
method, we require the additional information that y is equal to −0.3929953 at t = −1.


Solution. The predictor [Eq. (23.7)] is used to extrapolate linearly from t = −1 to 1:

$$y^0_1 = -0.3929953 + \left[4e^{0.8(0)} - 0.5(2)\right]2 = 5.607005$$

The corrector [Eq. (23.8)] is then used to compute the value:

$$y^1_1 = 2 + \frac{4e^{0.8(0)} - 0.5(2) + 4e^{0.8(1)} - 0.5(5.607005)}{2}\,1 = 6.549331$$

which represents a true percent relative error of −5.73% (true value = 6.194631). This error
is somewhat smaller than the value of −8.18% incurred in the self-starting Heun.
Now, Eq. (23.8) can be applied iteratively to improve the solution:

$$y^2_1 = 2 + \frac{3 + 4e^{0.8(1)} - 0.5(6.549331)}{2}\,1 = 6.313749$$

which represents an error of −1.92%. An approximate estimate of the error can be determined
using Eq. (23.9):

$$|\varepsilon_a| = \left|\frac{6.313749 - 6.549331}{6.313749}\right| \times 100\% = 3.7\%$$

Equation (23.8) can be applied iteratively until $\varepsilon_a$ falls below a prespecified value of
$\varepsilon_s$. As was the case with the Heun method (recall Example 22.2), the iterations converge
on a value of 6.36087 ($\varepsilon_t$ = −2.68%). However, because the initial predictor value is more
accurate, the multistep method converges at a somewhat faster rate.

For the second step, the predictor is

$$y^0_2 = 2 + \left[4e^{0.8(1)} - 0.5(6.36087)\right]2 = 13.44346 \qquad \varepsilon_t = 9.43\%$$

which is superior to the prediction of 12.0826 ($\varepsilon_t$ = 18%) that was computed with the
original Heun method. The first corrector yields 15.76693 ($\varepsilon_t$ = 6.8%), and subsequent
iterations converge on the same result as was obtained with the self-starting Heun method:
15.30224 ($\varepsilon_t$ = −3.09%). As with the previous step, the rate of convergence of the corrector
is somewhat improved because of the better initial prediction.
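
The hand calculations above can be reproduced with a short script. This is only a sketch of Eqs. (23.7) through (23.9): the stopping tolerance es, the iteration cap of 50, and the variable names are assumptions that do not appear in the example.

```matlab
% Sketch of the non-self-starting Heun method for the first two steps.
f = @(t,y) 4*exp(0.8*t) - 0.5*y;
h = 1;
es = 1e-5;                         % stopping tolerance in percent (assumed)
yPrev = -0.3929953;                % y at t = -1
t = 0; y = 2;                      % y at t = 0
yPred = zeros(1,2); yCorr = zeros(1,2);
for step = 1:2
    y0 = yPrev + f(t,y)*2*h;                       % predictor, Eq. (23.7)
    yNew = y0;
    for j = 1:50
        yOld = yNew;
        yNew = y + (f(t,y) + f(t+h,yOld))/2*h;     % corrector, Eq. (23.8)
        ea = abs((yNew - yOld)/yNew)*100;          % Eq. (23.9)
        if ea < es, break, end
    end
    yPred(step) = y0; yCorr(step) = yNew;
    fprintf('t = %g: predictor %.6f, corrector %.6f (%d iterations)\n',...
        t+h,y0,yNew,j)
    yPrev = y; y = yNew; t = t + h;                % shift values for next step
end
```

The stored predictor and corrector values are exactly the two by-products needed for the error estimate developed in the next section.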


23.2.2 Error Estimates 


Aside from providing increased efficiency, the non-self-starting Heun can also be used to 
estimate the local truncation error. As with the adaptive RK methods in Sec. 23.1, the error 
estimate then provides a criterion for changing the step size. 

The error estimate can be derived by recognizing that the predictor is equivalent to the 
midpoint rule. Hence, its local truncation error is (Table 19.4) 


$$E_p = \frac{1}{3}h^3y^{(3)}(\xi_p) = \frac{1}{3}h^3f''(\xi_p) \tag{23.10}$$

where the subscript p designates that this is the error of the predictor. This error estimate
can be combined with the estimate of $y_{i+1}$ from the predictor step to yield

$$\text{True value} = y^0_{i+1} + \frac{1}{3}h^3y^{(3)}(\xi_p) \tag{23.11}$$

By recognizing that the corrector is equivalent to the trapezoidal rule, a similar estimate
of the local truncation error for the corrector is (Table 19.2)

$$E_c = -\frac{1}{12}h^3y^{(3)}(\xi_c) = -\frac{1}{12}h^3f''(\xi_c) \tag{23.12}$$

This error estimate can be combined with the corrector result $y^m_{i+1}$ to give

$$\text{True value} = y^m_{i+1} - \frac{1}{12}h^3y^{(3)}(\xi_c) \tag{23.13}$$

Equation (23.11) can be subtracted from Eq. (23.13) to yield

$$0 = y^m_{i+1} - y^0_{i+1} - \frac{5}{12}h^3y^{(3)}(\xi) \tag{23.14}$$

where ξ is now between $t_{i-1}$ and $t_{i+1}$. Now, dividing Eq. (23.14) by 5 and rearranging the
result gives

$$\frac{y^0_{i+1} - y^m_{i+1}}{5} = -\frac{1}{12}h^3y^{(3)}(\xi) \tag{23.15}$$

Notice that the right-hand sides of Eqs. (23.12) and (23.15) are identical, with the exception
of the argument of the third derivative. If the third derivative does not vary appreciably
over the interval in question, we can assume that the right-hand sides are equal, and therefore,
the left-hand sides should also be equivalent, as in

$$E_c = -\frac{y^m_{i+1} - y^0_{i+1}}{5} \tag{23.16}$$

Thus, we have arrived at a relationship that can be used to estimate the per-step truncation
error on the basis of two quantities that are routine by-products of the computation: the
predictor ($y^0_{i+1}$) and the corrector ($y^m_{i+1}$).


EXAMPLE 23.4 Estimate of Per-Step Truncation Error


Problem Statement. Use Eq. (23.16) to estimate the per-step truncation error of 
Example 23.3. Note that the true values at t = 1 and 2 are 6.194631 and 14.84392, 
respectively. 


Solution. At $t_{i+1}$ = 1, the predictor gives 5.607005 and the corrector yields 6.360865.
These values can be substituted into Eq. (23.16) to give

$$E_c = -\frac{6.360865 - 5.607005}{5} = -0.150772$$

which compares well with the exact error,

$$E_t = 6.194631 - 6.360865 = -0.1662341$$

At $t_{i+1}$ = 2, the predictor gives 13.44346 and the corrector yields 15.30224, which can
be used to compute

$$E_c = -\frac{15.30224 - 13.44346}{5} = -0.37176$$

which also compares favorably with the exact error, $E_t$ = 14.84392 − 15.30224 =
−0.45831.
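
The arithmetic of this example is easy to check by machine. The sketch below simply applies Eq. (23.16) to the predictor and corrector values quoted above; the vector names are assumptions.

```matlab
% Sketch: per-step truncation error estimates from Eq. (23.16).
yPred = [5.607005 13.44346];       % predictor results at t = 1 and 2
yCorr = [6.360865 15.30224];       % converged corrector results
yTrue = [6.194631 14.84392];       % true values
Ec = -(yCorr - yPred)/5;           % estimated error, Eq. (23.16)
Et = yTrue - yCorr;                % exact error for comparison
disp([Ec; Et])
```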


The foregoing has been a brief introduction to multistep methods. Additional information 
can be found elsewhere (e.g., Chapra and Canale, 2010). Although they still have their 
place for solving certain types of problems, multistep methods are usually not the method 
of choice for most problems routinely confronted in engineering and science. That said, 
they are still used. For example, the MATLAB function ode113 is a multistep method. We 
have therefore included this section to introduce you to their basic principles. 


23.3 STIFFNESS


Stiffness is a special problem that can arise in the solution of ordinary differential equations.
A stiff system is one involving rapidly changing components together with slowly
changing ones. In some cases, the rapidly varying components are ephemeral transients
that die away quickly, after which the solution becomes dominated by the slowly varying
components. Although the transient phenomena exist for only a short part of the integration
interval, they can dictate the time step for the entire solution.

FIGURE 23.8
Plot of a stiff solution of a single ODE. Although the solution appears to start at 1, there is
actually a fast transient from y = 0 to 1 that occurs in less than 0.005 time unit. This
transient is perceptible only when the response is viewed on the finer timescale in the inset.

Both individual and systems of ODEs can be stiff. An example of a single stiff ODE is


$$\frac{dy}{dt} = -1000y + 3000 - 2000e^{-t} \tag{23.17}$$

If y(0) = 0, the analytical solution can be developed as

$$y = 3 - 0.998e^{-1000t} - 2.002e^{-t} \tag{23.18}$$

As in Fig. 23.8, the solution is initially dominated by the fast exponential term ($e^{-1000t}$).
After a short period (t < 0.005), this transient dies out and the solution becomes governed
by the slow exponential ($e^{-t}$).

Insight into the step size required for stability of such a solution can be gained by ex- 
amining the homogeneous part of Eq. (23.17): 


$$\frac{dy}{dt} = -ay \tag{23.19}$$

If y(0) = $y_0$, calculus can be used to determine the solution as

$$y = y_0e^{-at}$$

Thus, the solution starts at $y_0$ and asymptotically approaches zero.
Euler’s method can be used to solve the same problem numerically:

$$y_{i+1} = y_i + \frac{dy_i}{dt}h$$

Substituting Eq. (23.19) gives

$$y_{i+1} = y_i - ay_ih$$


or

$$y_{i+1} = y_i(1 - ah) \tag{23.20}$$

The stability of this formula clearly depends on the step size h. That is, |1 − ah| must be less
than 1. Thus, if h > 2/a, $|y_i| \to \infty$ as $i \to \infty$.

For the fast transient part of Eq. (23.18), this criterion can be used to show that the 
step size to maintain stability must be < 2/1000 = 0.002. In addition, we should note that, 
whereas this criterion maintains stability (i.e., a bounded solution), an even smaller step 
size would be required to obtain an accurate solution. Thus, although the transient occurs 
for only a small fraction of the integration interval, it controls the maximum allowable 
step size. 
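
The stability limit can be checked numerically. The following script is a minimal sketch rather than part of the original development: it applies Eq. (23.20) to the homogeneous problem with a = 1000 and y(0) = 1 (values assumed here for illustration), using one step size below and one above 2/a = 0.002. The variable names are our own.

```matlab
% Explicit Euler applied to dy/dt = -a*y (Eq. 23.19), y(0) = 1 (assumed).
a = 1000;                 % decay rate of the fast transient
hs = [0.0015 0.0025];     % one step below and one above the limit 2/a = 0.002
nsteps = 40;              % number of Euler steps taken with each step size
yfinal = zeros(size(hs));
for k = 1:numel(hs)
  y = 1;
  for i = 1:nsteps
    y = y*(1 - a*hs(k));  % Eq. (23.20)
  end
  yfinal(k) = y;
  fprintf('h = %6.4f  |1 - a*h| = %4.2f  |y| after %d steps = %g\n', ...
          hs(k), abs(1 - a*hs(k)), nsteps, abs(y));
end
```

With h = 0.0015 the factor 1 - ah equals -0.5, so the iterates alternate in sign but shrink. With h = 0.0025 the factor is -1.5 and the iterates grow without bound.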

Rather than using explicit approaches, implicit methods offer an alternative remedy. 
Such representations are called implicit because the unknown appears on both sides of the 
equation. An implicit form of Euler’s method can be developed by evaluating the derivative 
at the future time: 


$$ y_{i+1} = y_i + \frac{dy_{i+1}}{dt} h $$

This is called the backward, or implicit, Euler’s method. Substituting Eq. (23.19) yields 
$$ y_{i+1} = y_i - a y_{i+1} h $$

which can be solved for 
$$ y_{i+1} = \frac{y_i}{1 + ah} \tag{23.21} $$

For this case, regardless of the size of the step, $|y_i| \to 0$ as $i \to \infty$. Hence, the approach is 
called unconditionally stable. 
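
To see what unconditional stability means in practice, the sketch below iterates Eq. (23.21) for several step sizes, including some far beyond the explicit limit of 2/a. The values a = 1000 and y(0) = 1, the list of step sizes, and the variable names are assumptions chosen for illustration.

```matlab
% Implicit Euler applied to dy/dt = -a*y (Eq. 23.19), y(0) = 1 (assumed).
a = 1000;
hs = [0.001 0.01 0.1 1];     % the last three exceed the explicit limit 2/a
nsteps = 10;
yfinal = zeros(size(hs));
for k = 1:numel(hs)
  y = 1;
  for i = 1:nsteps
    y = y/(1 + a*hs(k));     % Eq. (23.21)
  end
  yfinal(k) = y;
  fprintf('h = %6.3f  y after %d steps = %g\n', hs(k), nsteps, y);
end
```

Because 1/(1 + ah) lies between 0 and 1 for any positive h, every run decays toward zero. Keep in mind that stability says nothing about accuracy: a very large step still gives a crude picture of the transient.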

EXAMPLE 23.5 Explicit and Implicit Euler 


Problem Statement. Use both the explicit and implicit Euler methods to solve Eq. (23.17), 
where y(0) = 0. (a) Use the explicit Euler with step sizes of 0.0005 and 0.0015 to solve for 
y between t = 0 and 0.006. (b) Use the implicit Euler with a step size of 0.05 to solve for y 
between 0 and 0.4. 


Solution. (a) For this problem, the explicit Euler’s method is 
$$ y_{i+1} = y_i + (-1000y_i + 3000 - 2000e^{-t_i})h $$

The result for h = 0.0005 is displayed in Fig. 23.9a along with the analytical solution. 
Although it exhibits some truncation error, the result captures the general shape of the 
analytical solution. In contrast, when the step size is increased to a value just below the 
stability limit (h = 0.0015), the solution manifests oscillations. Using h > 0.002 would 
result in a totally unstable solution; that is, it would go infinite as the solution progressed. 


(b) The implicit Euler’s method is 
$$ y_{i+1} = y_i + (-1000y_{i+1} + 3000 - 2000e^{-t_{i+1}})h $$

Now because the ODE is linear, we can rearrange this equation so that $y_{i+1}$ is isolated on 
the left-hand side: 
$$ y_{i+1} = \frac{y_i + 3000h - 2000he^{-t_{i+1}}}{1 + 1000h} $$

FIGURE 23.9 
Solution of a stiff ODE with (a) the explicit and (b) implicit Euler methods. 

The result for h = 0.05 is displayed in Fig. 23.9b along with the analytical solution. Notice 
that even though we have used a much bigger step size than the one that induced instability 
for the explicit Euler, the numerical result tracks nicely on the analytical solution. 
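
The two computations in this example can be scripted in a few lines. The following is a sketch under our own naming choices (f, yexact, te, ye, ti, yi are not from the text); it covers the h = 0.0005 explicit case and the h = 0.05 implicit case and compares each with Eq. (23.18).

```matlab
% Right-hand side of Eq. (23.17) and the analytical solution, Eq. (23.18)
f = @(t,y) -1000*y + 3000 - 2000*exp(-t);
yexact = @(t) 3 - 0.998*exp(-1000*t) - 2.002*exp(-t);

% (a) Explicit Euler, h = 0.0005, from t = 0 to 0.006
h = 0.0005;
te = (0:12)*h;
ye = zeros(size(te));            % y(0) = 0
for i = 1:numel(te)-1
  ye(i+1) = ye(i) + f(te(i), ye(i))*h;
end

% (b) Implicit Euler, h = 0.05, from t = 0 to 0.4
h = 0.05;
ti = (0:8)*h;
yi = zeros(size(ti));            % y(0) = 0
for i = 1:numel(ti)-1
  % rearranged form with y(i+1) isolated on the left-hand side
  yi(i+1) = (yi(i) + 3000*h - 2000*h*exp(-ti(i+1)))/(1 + 1000*h);
end

errExplicit = max(abs(ye - yexact(te)));
errImplicit = max(abs(yi - yexact(ti)));
fprintf('max error: explicit %g, implicit %g\n', errExplicit, errImplicit);
```

Changing the explicit step size to 0.0015 or to a value above 0.002 reproduces the oscillating and unstable behaviors described in part (a). Adding plot(te,ye,'o',te,yexact(te)) gives a picture along the lines of Fig. 23.9a.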


Systems of ODEs can also be stiff. An example is 


$$ \frac{dy_1}{dt} = -5y_1 + 3y_2 \tag{23.22a} $$
$$ \frac{dy_2}{dt} = 100y_1 - 301y_2 \tag{23.22b} $$
For the initial conditions $y_1(0) = 52.29$ and $y_2(0) = 83.82$, the exact solution is 
$$ y_1 = 52.96e^{-3.9899t} - 0.67e^{-302.0101t} \tag{23.23a} $$
$$ y_2 = 17.83e^{-3.9899t} + 65.99e^{-302.0101t} \tag{23.23b} $$


Note that the exponents are negative and differ by about two orders of magnitude. As with 
the single equation, it is the large exponents that respond rapidly and are at the heart of the 
system’s stiffness. 

An implicit Euler’s method for systems can be formulated for the present example as 


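$$ y_{1,i+1} = y_{1,i} + (-5y_{1,i+1} + 3y_{2,i+1})h \tag{23.24a} $$
$$ y_{2,i+1} = y_{2,i} + (100y_{1,i+1} - 301y_{2,i+1})h \tag{23.24b} $$
Collecting terms gives 
$$ (1 + 5h)y_{1,i+1} - 3hy_{2,i+1} = y_{1,i} \tag{23.25a} $$
$$ -100hy_{1,i+1} + (1 + 301h)y_{2,i+1} = y_{2,i} \tag{23.25b} $$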


Thus, we can see that the problem consists of solving a set of simultaneous equations for 
each time step. 

For nonlinear ODEs, the solution becomes even more difficult since it involves solving 
a system of nonlinear simultaneous equations (recall Sec. 12.2). Thus, although stability is 
gained through implicit approaches, a price is paid in the form of added solution complexity. 
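
For the linear system above, the simultaneous equations at each step can be handed to MATLAB's backslash operator. The sketch below is one way to do this, not the book's own code; the step size h = 0.05, the number of steps, and the variable names are assumptions made for illustration.

```matlab
% Implicit Euler for the stiff linear system of Eqs. (23.22a) and (23.22b)
A = [-5 3; 100 -301];        % coefficient matrix of the ODE system
h = 0.05;                    % assumed step, well above the explicit limit of about 2/302
n = 20;                      % number of steps (integrates to t = 1)
y = zeros(2, n+1);
y(:,1) = [52.29; 83.82];     % initial conditions
M = eye(2) - h*A;            % coefficients on the left-hand side of Eqs. (23.25a,b)
for i = 1:n
  y(:,i+1) = M \ y(:,i);     % solve the 2-by-2 linear system for the new values
end
t = (0:n)*h;
% Exact solution, Eqs. (23.23a) and (23.23b), for comparison
yexact = [52.96*exp(-3.9899*t) - 0.67*exp(-302.0101*t);
          17.83*exp(-3.9899*t) + 65.99*exp(-302.0101*t)];
fprintf('t = 1: y1 = %g (exact %g), y2 = %g (exact %g)\n', ...
        y(1,end), yexact(1,end), y(2,end), yexact(2,end));
```

The solution stays bounded even though the step is far larger than an explicit method could tolerate, but the first-order accuracy shows: the computed values at t = 1 are noticeably larger than the exact ones. For a nonlinear system, the backslash solve would have to be replaced by an iterative root finder at every step.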


23.3.1 MATLAB Functions for Stiff Systems 
MATLAB has a number of built-in functions for solving stiff systems of ODEs. These are 


ode15s. This function is a variable-order solver based on numerical differentiation 
formulas. It is a multistep solver that optionally uses the Gear backward differentiation 
formulas. This is used for stiff problems of low to medium accuracy. 


ode23s. This function is based on a modified Rosenbrock formula of order 2. Because it 
is a one-step solver, it may be more efficient than ode15s at crude tolerances. It can solve 
some kinds of stiff problems better than ode15s. 

ode23t. This function is an implementation of the trapezoidal rule with a “free” inter- 
polant. This is used for moderately stiff problems with low accuracy where you need a 
solution without numerical damping. 


ode23tb. This is an implementation of an implicit Runge-Kutta formula with a first stage 
that is a trapezoidal rule and a second stage that is a backward differentiation formula of 
order 2. This solver may also be more efficient than ode15s at crude tolerances. 
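
A quick experiment makes the benefit of these solvers concrete. The following sketch solves the single stiff ODE of Eq. (23.17) with both ode45 and ode15s at their default tolerances and reports how many output points each returns. The interval [0 4] and the variable names are our own choices, and the exact counts depend on the MATLAB version and tolerance settings, so only the relative sizes should be read into the result.

```matlab
% Compare a nonstiff and a stiff solver on Eq. (23.17) with y(0) = 0
f = @(t,y) -1000*y + 3000 - 2000*exp(-t);
[t45, y45] = ode45(f, [0 4], 0);     % explicit Runge-Kutta pair
[t15, y15] = ode15s(f, [0 4], 0);    % variable-order stiff solver
yend = 3 - 0.998*exp(-1000*4) - 2.002*exp(-4);   % Eq. (23.18) at t = 4
fprintf('ode45 : %d points, final error %g\n', numel(t45), abs(y45(end) - yend));
fprintf('ode15s: %d points, final error %g\n', numel(t15), abs(y15(end) - yend));
```

Both solvers should land close to the analytical value, but ode45 is forced to take many small steps long after the fast transient has died away, whereas ode15s can lengthen its steps once the solution is smooth.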


EXAMPLE 23.6 MATLAB for Stiff ODEs 


Problem Statement. The van der Pol equation is a model of an electronic circuit that 
arose back in the days of vacuum tubes, 


$$ \frac{d^2y_1}{dt^2} - \mu(1 - y_1^2)\frac{dy_1}{dt} + y_1 = 0 \tag{E23.6.1} $$
The solution to this equation becomes progressively stiffer as μ gets large. Given the initial 
conditions, $y_1(0) = dy_1/dt = 1$, use MATLAB to solve the following two cases: (a) for 
μ = 1, use ode45 to solve from t = 0 to 20; and (b) for μ = 1000, use ode23s to solve from 
t = 0 to 6000. 
Solution. (a) The first step is to convert the second-order ODE into a pair of first-order 
ODEs by defining 

$$ \frac{dy_1}{dt} = y_2 $$

Using this equation, Eq. (E23.6.1) can be written as 
$$ \frac{dy_2}{dt} - \mu(1 - y_1^2)y_2 + y_1 = 0 $$
An M-file can now be created to hold this pair of differential equations: 


function yp = vanderpol(t,y,mu) 
yp = [y(2);mu*(1-y(1)^2)*y(2)-y(1)]; 


Notice how the value of μ is passed as a parameter. As in Example 23.1, ode45 can be 
invoked and the results plotted: 


>> [t,y] = ode45(@vanderpol,[0 20],[1 1],[],1); 
>> plot(t,y(:,1),'-',t,y(:,2),'--') 
>> legend('y1','y2'); 

Observe that because we are not specifying any options, we must use empty brackets [] as 
a place holder. The smooth nature of the plot (Fig. 23.10a) suggests that the van der Pol 
equation with μ = 1 is not a stiff system. 


(b) If a standard solver like ode45 is used for the stiff case (μ = 1000), it will fail miserably 
(try it, if you like). However, ode23s does an efficient job: 


>> [t,y] = ode23s(@vanderpol,[0 6000],[1 1],[],1000); 
>> plot(t,y(:,1)) 


FIGURE 23.10 
Solutions for van der Pol’s equation. (a) Nonstiff form solved with ode45 and (b) stiff form 
solved with ode23s. 
We have only displayed the $y_1$ component because the result for $y_2$ has a much larger 
scale. Notice how this solution (Fig. 23.10b) has much sharper edges than is the case in 
Fig. 23.10a. This is a visual manifestation of the “stiffness” of the solution. 
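
The calls in this example pass μ as a trailing argument after an empty options placeholder. An alternative is to wrap the function in an anonymous function that captures the parameter, which avoids the placeholder altogether. The sketch below shows this for the nonstiff case; the handle name vdp is our own, and the anonymous-function version of the right-hand side is assumed to be equivalent to the vanderpol M-file.

```matlab
% van der Pol right-hand side written as an anonymous function of (t, y, mu)
vdp = @(t,y,mu) [y(2); mu*(1 - y(1)^2)*y(2) - y(1)];

% Nonstiff case: capture mu = 1 in a two-argument wrapper for ode45
mu = 1;
[t, y] = ode45(@(t,y) vdp(t,y,mu), [0 20], [1 1]);
fprintf('mu = %g: max |y1| = %g over %d output points\n', ...
        mu, max(abs(y(:,1))), numel(t));
```

The stiff case follows the same pattern with mu = 1000, ode23s in place of ode45, and a time span of [0 6000]. Adding plot(t,y(:,1),'-',t,y(:,2),'--') after the call reproduces the kind of plot shown in Fig. 23.10a.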


23.4 MATLAB APPLICATION: BUNGEE JUMPER WITH CORD 


In this section, we will use MATLAB to solve for the vertical dynamics of a jumper con- 
nected to a stationary platform with a bungee cord. As developed at the beginning of 
Chap. 22, the problem consisted of solving two coupled ODEs for vertical position and 
velocity. The differential equation for position is 


$$ \frac{dx}{dt} = v \tag{23.26} $$
The differential equation for velocity is different depending on whether the jumper has fallen 
to a distance where the cord is fully extended and begins to stretch. Thus, if the distance 
fallen is less than the cord length, the jumper is only subject to gravitational and drag forces, 

$$ \frac{dv}{dt} = g - \text{sign}(v)\frac{c_d}{m}v^2 \tag{23.27a} $$
Once the cord begins to stretch, the spring and damping forces of the cord must also be 
included: 

$$ \frac{dv}{dt} = g - \text{sign}(v)\frac{c_d}{m}v^2 - \frac{k}{m}(x - L) - \frac{\gamma}{m}v \tag{23.27b} $$


The following example shows how MATLAB can be used to solve this problem. 


EXAMPLE 23.7 Bungee Jumper with Cord 


Problem Statement. Determine the position and velocity of a bungee jumper with the 
following parameters: L = 30 m, g = 9.81 m/s², m = 68.1 kg, $c_d$ = 0.25 kg/m, k = 40 N/m, 
and γ = 8 N · s/m. Perform the computation from t = 0 to 50 s and assume that the initial 
conditions are x(0) = v(0) = 0. 


Solution. The following M-file can be set up to compute the right-hand sides of the ODEs: 


function dydt = bungee(t,y,L,cd,m,k,gamma) 
g = 9.81; 
cord = 0; 
if y(1) > L %determine if the cord exerts a force 
  cord = k/m*(y(1)-L)+gamma/m*y(2); 
end 
dydt = [y(2); g - sign(y(2))*cd/m*y(2)^2 - cord]; 


Notice that the derivatives are returned as a column vector because this is the format 
required by the MATLAB solvers. 
Because these equations are not stiff, we can use ode45 to obtain the solutions and 
display them on a plot: 


>> [t,y] = ode45(@bungee,[0 50],[0 0],[],30,0.25,68.1,40,8); 
>> plot(t,-y(:,1),'-',t,y(:,2),':') 
>> legend('x (m)','v (m/s)') 


As in Fig. 23.11, we have reversed the sign of distance for the plot so that negative distance 
is in the downward direction. Notice how the simulation captures the jumper’s bouncing 
motion. 


FIGURE 23.11 
Plot of distance and velocity of a bungee jumper. 
23.5 CASE STUDY  PLINY’S INTERMITTENT FOUNTAIN 


Background. The Roman natural philosopher, Pliny the Elder, purportedly had an inter- 
mittent fountain in his garden. As in Fig. 23.12, water enters a cylindrical tank at a constant 
flow rate $Q_{in}$ and fills until the water reaches $y_{high}$. At this point, water siphons out of the 
tank through a circular discharge pipe, producing a fountain at the pipe’s exit. The fountain 
runs until the water level decreases to $y_{low}$, whereupon the siphon fills with air and the 
fountain stops. The cycle then repeats as the tank fills until the water reaches $y_{high}$, and the 
fountain flows again. 

When the siphon is running, the outflow $Q_{out}$ can be computed with the following 
formula based on Torricelli’s law: 

$$ Q_{out} = C\sqrt{2gy}\,\pi r^2 \tag{23.28} $$
FIGURE 23.12 
An intermittent fountain. 


Neglecting the volume of water in the pipe, compute and plot the level of the water in the 
tank as a function of time over 100 seconds. Assume an initial condition of an empty tank 
y(0) = 0, and employ the following parameters for your computation: 


$R_T$ = 0.05 m        r = 0.007 m        $y_{low}$ = 0.025 m 
$y_{high}$ = 0.1 m    C = 0.6            g = 9.81 m/s² 
$Q_{in}$ = 50 × 10⁻⁶ m³/s 


Solution. When the fountain is running, the rate of change in the tank’s volume V (m³) is 
determined by a simple balance of inflow minus the outflow: 

$$ \frac{dV}{dt} = Q_{in} - Q_{out} \tag{23.29} $$

where V = volume (m³). Because the tank is cylindrical, $V = \pi R_T^2 y$. Substituting this rela- 
tionship along with Eq. (23.28) into Eq. (23.29) gives 

$$ \frac{dy}{dt} = \frac{Q_{in} - C\sqrt{2gy}\,\pi r^2}{\pi R_T^2} \tag{23.30} $$

When the fountain is not running, the second term in the numerator goes to zero. We 
can incorporate this mechanism in the model by introducing a new dimensionless variable 
siphon that equals zero when the fountain is off and equals one when it is flowing: 

$$ \frac{dy}{dt} = \frac{Q_{in} - siphon \times C\sqrt{2gy}\,\pi r^2}{\pi R_T^2} \tag{23.31} $$


In the present context, siphon can be thought of as a switch that turns the fountain off and 
on. Such two-state variables are called Boolean or logical variables, where zero is equiva- 
lent to false and one is equivalent to true. 


Next we must relate siphon to the dependent variable y. First, siphon is set to zero 
whenever the level falls below $y_{low}$. Conversely, siphon is set to one whenever the level rises 
above $y_{high}$. The following M-file function follows this logic in computing the derivative: 


function dy = Plinyode(t,y) 
global siphon 
Rt = 0.05; r = 0.007; yhi = 0.1; ylo = 0.025; 
C = 0.6; g = 9.81; Qin = 0.00005; 
if y(1) <= ylo 
  siphon = 0; 
elseif y(1) >= yhi 
  siphon = 1; 
end 
Qout = siphon * C * sqrt(2 * g * y(1)) * pi * r ^ 2; 
dy = (Qin - Qout) / (pi * Rt ^ 2); 


Notice that because its value must be maintained between function calls, siphon is declared 
as a global variable. Although the use of global variables is not encouraged (particularly in 
larger programs), it is useful in the present context. 

The following script employs the built-in ode45 function to integrate Plinyode and gen- 
erate a plot of the solution: 


global siphon 

siphon = 0; 

tspan = [0 100]; y0 = 0; 
[tp,yp] = ode45(@Plinyode, tspan, y0); 
plot(tp,yp) 

xlabel('time, (s)') 

ylabel('water level in tank, (m)') 


As shown in Fig. 23.13, the result is clearly incorrect. Except for the original filling 
period, the level seems to start emptying prior to reaching $y_{high}$. Similarly, when it is drain- 
ing, the siphon shuts off well before the level drops to $y_{low}$. 

At this point, suspecting that the problem demands more firepower than the trusty 
ode45 routine, you might be tempted to use one of the other MATLAB ODE solvers such 
as ode23s or ode23tb. But if you did, you would discover that although these routines yield 
somewhat different results, they would still generate incorrect solutions. 

The difficulty arises because the ODE is discontinuous at the point that the siphon 
switches on or off. For example, as the tank is filling, the derivative is dependent only on 
the constant inflow and for the present parameters has a constant value of 6.366 × 10⁻³ m/s. 
However, as soon as the level reaches $y_{high}$, the outflow kicks in and the derivative abruptly 
drops to −1.013 × 10⁻² m/s. Although the adaptive step-size routines used by MATLAB 
work marvelously for many problems, they often get heartburn when dealing with such 
discontinuities. Because they infer the behavior of the solution by comparing the results of 
different steps, a discontinuity represents something akin to stepping into a deep pothole 
on a dark street. 

FIGURE 23.13 
The level in Pliny’s fountain versus time as simulated with ode45. 


At this point, your first inclination might be to just give up. After all, if it’s too hard 
for MATLAB, no reasonable person could expect you to come up with a solution. Because 
professional engineers and scientists rarely get away with such excuses, your only recourse 
is to develop a remedy based on your knowledge of numerical methods. 

Because the problem results from adaptively stepping across a discontinuity, you might 
revert to a simpler approach and use a constant, small step size. If you think about it, that’s 
precisely the approach you would take if you were traversing a dark, pothole-filled street. 
We can implement this solution strategy by merely replacing ode45 with the constant-step 
rk4sys function from Chap. 22 (Fig. 22.8). For the script outlined above, the fourth line 
would be formulated as 


[tp,yp] = rk4sys(@Plinyode, tspan, y0,0.0625) ; 


As in Fig. 23.14, the solution now evolves as expected. The tank fills to $y_{high}$ and then 
empties until it reaches $y_{low}$, when the cycle repeats. 
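
If the rk4sys function from Chap. 22 is not at hand, the same idea can be sketched in a self-contained script. The version below is our own variant, not the book's implementation: it uses the classical fourth-order RK formulas with a fixed step of 0.0625 s, and it updates the siphon switch once at the start of each step rather than through a global variable inside the derivative function. The max(y,0) guard and all variable names are assumptions added for illustration.

```matlab
% Fixed-step RK4 simulation of Pliny's fountain, Eq. (23.31)
Rt = 0.05; r = 0.007; yhi = 0.1; ylo = 0.025;
C = 0.6; g = 9.81; Qin = 0.00005;
% Derivative for a given level y and switch state (0 = off, 1 = flowing);
% max(y,0) only guards the square root against a negative argument
dydt = @(y,siphon) (Qin - siphon*C*sqrt(2*g*max(y,0))*pi*r^2)/(pi*Rt^2);

h = 0.0625;                  % constant step size (s)
n = round(100/h);            % number of steps to reach t = 100 s
t = (0:n)*h;
y = zeros(1, n+1);           % empty tank at t = 0
siphon = 0;
for i = 1:n
  % update the switch from the level at the start of the step
  if y(i) <= ylo
    siphon = 0;
  elseif y(i) >= yhi
    siphon = 1;
  end
  k1 = dydt(y(i), siphon);
  k2 = dydt(y(i) + 0.5*h*k1, siphon);
  k3 = dydt(y(i) + 0.5*h*k2, siphon);
  k4 = dydt(y(i) + h*k3, siphon);
  y(i+1) = y(i) + h*(k1 + 2*k2 + 2*k3 + k4)/6;
end
fprintf('max level = %g m, min level after first fill = %g m\n', ...
        max(y), min(y(find(y >= yhi, 1):end)));
```

Because the switch can only change between steps, the level overshoots each threshold by at most one step's worth of change, which is small for this step size. Adding plot(t,y) shows the fill-and-drain cycle.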

There are two take-home messages that can be gleaned from this case study. First, 
although it’s human nature to think the opposite, simpler is sometimes better. After all, to 
paraphrase Einstein, “Everything should be as simple as possible, but no simpler.” Second, 
you should never blindly believe every result generated by the computer. You’ve probably 
heard the old chestnut, “garbage in, garbage out” in reference to the impact of data quality 
on the validity of computer output. Unfortunately, some individuals think that regardless 
of what went in (the data) and what’s going on inside (the algorithm), it’s always “gospel 
out.” Situations like the one depicted in Fig. 23.13 are particularly dangerous because, 
although the output is incorrect, it’s not obviously wrong. That is, the simulation does not go 
unstable or yield negative levels. In fact, the solution moves up and down in the manner of 
an intermittent fountain, albeit incorrectly. 

FIGURE 23.14 
The level in Pliny’s fountain versus time as simulated with a small, constant step size using 
the rk4sys function (Fig. 22.8). 

Hopefully, this case study illustrates that even a great piece of software such as 
MATLAB is not foolproof. Hence, sophisticated engineers and scientists always examine 
numerical output with a healthy skepticism based on their considerable experience and 
knowledge of the problems they are solving. 
PROBLEMS 


23.1 Repeat the same simulations as in Sec. 23.5 for Pliny’s 
fountain, but generate the solutions with ode23, ode23s, and 
ode113. Use subplot to develop a vertical three-pane plot of 
the time series. 
23.2 The following ODEs have been proposed as a model 
of an epidemic: 


$$ \frac{dS}{dt} = -aSI $$
$$ \frac{dI}{dt} = aSI - rI $$
$$ \frac{dR}{dt} = rI $$

where S = the susceptible individuals, I = the infected, R = 
the recovered, a = the infection rate, and r = the recovery 
rate. A city has 10,000 people, all of whom are susceptible. 
(a) If a single infectious individual enters the city at t = 0, 
compute the progression of the epidemic until the number 
of infected individuals falls below 10. Use the following 
parameters: a = 0.002/(person · week) and r = 0.15/d. 
Develop time-series plots of all the state variables. Also 
generate a phase-plane plot of S versus I versus R. 
(b) Suppose that after recovery, there is a loss of immunity 
that causes recovered individuals to become suscep- 
tible. This reinfection mechanism can be computed as 
ρR, where ρ = the reinfection rate. Modify the model to 
include this mechanism and repeat the computations in 
(a) using ρ = 0.03/d. 

23.3 Solve the following initial-value problem over the 
interval from t = 2 to 3: 

$$ \frac{dy}{dt} = -0.5y + e^{-t} $$
Use the non-self-starting Heun method with a step size 
of 0.5 and initial conditions of y(1.5) = 5.222138 and 
y(2.0) = 4.143883. Iterate the corrector to e, = 0.1%. Com- 
pute the percent relative errors for your results based on the 
exact solutions obtained analytically: y(2.5) = 3.273888 
and y(3.0) = 2.577988. 
23.4 Solve the following initial-value problem over the 
interval from t = 0 to 0.5: 

dy_ 2 
dt TY 
Use the fourth-order RK method to predict the first value 
at t = 0.25. Then use the non-self-starting Heun method to 
make the prediction at t = 0.5. Note: y(0) = 1. 


23.5 Given 
$$ \frac{dy}{dt} = -100{,}000y + 99{,}999e^{-t} $$

(a) Estimate the step size required to maintain stability 
using the explicit Euler method. 

(b) If y(0) = 0, use the implicit Euler to obtain a solution 
from t = 0 to 2 using a step size of 0.1. 


23.6 Given 
$$ \frac{dy}{dt} = 30(\sin t - y) + 3\cos t $$

If y(0) = 0, use the implicit Euler to obtain a solution from 
t = 0 to 4 using a step size of 0.4. 


23.7 Given 
$$ \frac{dx_1}{dt} = 999x_1 + 1999x_2 $$
$$ \frac{dx_2}{dt} = -1000x_1 - 2000x_2 $$

If $x_1(0) = x_2(0) = 1$, obtain a solution from t = 0 to 0.2 using 
a step size of 0.05 with the (a) explicit and (b) implicit Euler 
methods. 

23.8 The following nonlinear, parasitic ODE was suggested 
by Hornbeck (1975): 


$$ \frac{dy}{dt} = 5(y - t^2) $$
If the initial condition is y(0) = 0.08, obtain a solution from 
t = 0 to 5: 
(a) Analytically. 
(b) Using the fourth-order RK method with a constant step 
size of 0.03125. 
(c) Using the MATLAB function ode45. 
(d) Using the MATLAB function ode23s. 
(e) Using the MATLAB function ode23tb. 
Present your results in graphical form. 
23.9 Recall from Example 20.5 that the humps function ex- 
hibits both flat and steep regions over a relatively short x 
range, 

$$ f(x) = \frac{1}{(x - 0.3)^2 + 0.01} + \frac{1}{(x - 0.9)^2 + 0.04} - 6 $$

Determine the value of the definite integral of this function 
between x = 0 and 1 using (a) the quad and (b) the ode45 
functions. 
23.10 The oscillations of a swinging pendulum can be sim- 
ulated with the following nonlinear model: 

$$ \frac{d^2\theta}{dt^2} + \frac{g}{l}\sin\theta = 0 $$

where θ = the angle of displacement, g = the gravitational 
constant, and l = the pendulum length. For small angular 
displacements, sin θ is approximately equal to θ and the 
model can be linearized as 

$$ \frac{d^2\theta}{dt^2} + \frac{g}{l}\theta = 0 $$

Use ode45 to solve for θ as a function of time for both the lin- 
ear and nonlinear models where l = 0.6 m and g = 9.81 m/s². 
First, solve for the case where the initial condition is for a 
small displacement (θ = π/8 and dθ/dt = 0). Then repeat the 
calculation for a large displacement (θ = π/2). For each case, 
plot the linear and nonlinear simulations on the same plot. 
23.11 Employ the events option described in Sec. 23.1.2 
to determine the period of a 1-m long, linear pendulum 
(see description in Prob. 23.10). Compute the period for 
the following initial conditions: (a) θ = π/8, (b) θ = π/4, 
and (c) θ = π/2. For all three cases, set the initial angular 
velocity at zero. (Hint: A good way to compute the period 
is to determine how long it takes for the pendulum to reach 
θ = 0 [i.e., the bottom of its arc]. The period is equal to 
four times this value.) 

23.12 Repeat Prob. 23.11, but for the nonlinear pendulum 
described in Prob. 23.10. 

23.13 The following system is a classic example of stiff 
ODEs that can occur in the solution of chemical reaction 
kinetics: 


$$ \frac{dc_1}{dt} = -0.013c_1 - 1000c_1c_3 $$
$$ \frac{dc_2}{dt} = -2500c_2c_3 $$
$$ \frac{dc_3}{dt} = -0.013c_1 - 1000c_1c_3 - 2500c_2c_3 $$

Solve these equations from t = 0 to 50 with initial condi- 
tions $c_1(0) = c_2(0) = 1$ and $c_3(0) = 0$. If you have access to 
MATLAB software, use both standard (e.g., ode45) and stiff 
(e.g., ode23s) functions to obtain your solutions. 

23.14 The following second-order ODE is considered to be 
stiff: 


$$ \frac{d^2y}{dx^2} = -1001\frac{dy}{dx} - 1000y $$

Solve this differential equation (a) analytically and 
(b) numerically for x = 0 to 5. For (b) use an implicit 
approach with h = 0.5. Note that the initial conditions are 
y(0) = 1 and y′(0) = 0. Display both results graphically. 
23.15 Consider the thin rod of length l moving in the 
x-y plane as shown in Fig. P23.15. The rod is fixed with a pin 
on one end and a mass at the other. Note that g = 9.81 m/s² 
and l = 0.5 m. This system can be solved using 


Let θ(0) = 0 and dθ/dt(0) = 0.25 rad/s. Solve using any method 
studied in this chapter. Plot the angle versus time and the 
angular velocity versus time. (Hint: Decompose the second- 
order ODE.) 

FIGURE P23.15 


23.16 Given the first-order ODE: 


$$ \frac{dx}{dt} = -700x - 1000e^{-t} $$
$$ x(0) = 4 $$


Solve this stiff differential equation using a numerical 
method over the time period 0 < t < 5. Also solve analyti- 
cally and plot the analytic and numerical solution for both 
the fast and slow transient phases of the time scale. 
23.17 Solve the following differential equation from 
t = 0 to 2 

$$ \frac{dy}{dt} = -10y $$
with the initial condition y(0) = 1. Use the following tech- 
niques to obtain your solutions: (a) analytically, (b) the ex- 
plicit Euler method, and (c) the implicit Euler method. For 
(b) and (c) use h = 0.1 and 0.2. Plot your results. 
23.18 The Lotka-Volterra equations described in Sec. 22.6 
have been refined to include additional factors that impact 
predator-prey dynamics. For example, over and above 
predation, prey population can be limited by other factors 
such as space. Space limitation can be incorporated into the 
model as a carrying capacity (recall the logistic model de- 
scribed in Prob. 22.5) as in 

$$ \frac{dx}{dt} = a\left(1 - \frac{x}{K}\right)x - bxy $$
$$ \frac{dy}{dt} = -cy + dxy $$


where K = the carrying capacity. Use the same parameter 

values and initial conditions as in Sec. 22.6 to integrate these 

equations from t = 0 to 100 using ode45, and develop both 

time series and phase-plane plots of the results. 

(a) Employ a very large value of K = 10° to validate that 
you obtain the same results as in Sec. 22.6. 

(b) Compare (a) with the more realistic carrying capacity of 
K = 200. Discuss your results. 

23.19 Two masses are attached to a wall by linear springs 
(Fig. P23.19). Force balances based on Newton’s second 
law can be written as 

$$ \frac{d^2x_1}{dt^2} = -\frac{k_1}{m_1}(x_1 - L_1) + \frac{k_2}{m_1}(x_2 - x_1 - w_1 - L_2) $$
$$ \frac{d^2x_2}{dt^2} = -\frac{k_2}{m_2}(x_2 - x_1 - w_1 - L_2) $$

where k = the spring constants, m = mass, L = the length of the 
unstretched spring, and w = the width of the mass. Com- 
pute the positions of the masses as a function of time 
using the following parameter values: $k_1 = k_2 = 5$, 
$m_1 = m_2 = 2$, $w_1 = w_2 = 5$, and $L_1 = L_2 = 2$. Set the initial 
conditions as $x_1 = L_1$ and $x_2 = L_1 + w_1 + L_2 + 6$. Perform the 
simulation from t = 0 to 20. Construct time-series plots of 
both the displacements and the velocities. In addition, pro- 
duce a phase-plane plot of $x_1$ versus $x_2$. 

FIGURE P23.19 

23.20 Use ode45 to integrate the differential equations 
for the system described in Prob. 23.19. Generate verti- 
cally stacked subplots of displacements (top) and velocities 
(bottom). Employ the fft function to compute the discrete 
Fourier transform (DFT) of the first mass’s displacement. 
Generate and plot a power spectrum in order to identify the 
system’s resonant frequencies. 

23.21 Perform the same computations as in Prob. 23.20 but 
based on the first floor of the structure in Prob. 22.22. 

23.22 Use the approach and example outlined in Sec. 23.1.2, 
but determine the time, height, and velocity when the bun- 
gee jumper is the farthest above the ground, and generate a 
plot of the solution. 

23.23 As depicted in Fig. P23.23, a double pendulum 
consists of a pendulum attached to another pendulum. We 
indicate the upper and lower pendulums by subscripts 1 and 
2, respectively, and we place the origin at the pivot point of 
the upper pendulum with y increasing upward. We further 
assume that the system oscillates in a vertical plane subject 
to gravity, that the pendulum rods are massless and rigid, 
and the pendulum masses are considered to be point masses. 
Under these assumptions, force balances can be used to de- 
rive the following equations of motion 

$$ \frac{d^2\theta_1}{dt^2} = \frac{-g(2m_1 + m_2)\sin\theta_1 - m_2 g\sin(\theta_1 - 2\theta_2) - 2\sin(\theta_1 - \theta_2)\,m_2\left((d\theta_2/dt)^2 L_2 + (d\theta_1/dt)^2 L_1\cos(\theta_1 - \theta_2)\right)}{L_1\left(2m_1 + m_2 - m_2\cos(2\theta_1 - 2\theta_2)\right)} $$
$$ \frac{d^2\theta_2}{dt^2} = \frac{2\sin(\theta_1 - \theta_2)\left((d\theta_1/dt)^2 L_1(m_1 + m_2) + g(m_1 + m_2)\cos\theta_1 + (d\theta_2/dt)^2 L_2 m_2\cos(\theta_1 - \theta_2)\right)}{L_2\left(2m_1 + m_2 - m_2\cos(2\theta_1 - 2\theta_2)\right)} $$

FIGURE P23.23 
A double pendulum. 

where the subscripts 1 and 2 designate the top and bot- 
tom pendulum, respectively, θ = angle (radians) with 
0 = vertical downward and counter-clockwise positive, 
t = time (s), g = gravitational acceleration (= 9.81 m/s²), 
m = mass (kg), and L = length (m). Note that the x and y 
coordinates of the masses are functions of the angles 
as in 

$$ x_1 = L_1\sin\theta_1 \qquad y_1 = -L_1\cos\theta_1 $$
$$ x_2 = x_1 + L_2\sin\theta_2 \qquad y_2 = y_1 - L_2\cos\theta_2 $$

(a) Use ode45 to solve for the angles and angular velocities 
of the masses as a function of time from t = 0 to 40 s. 
Employ subplot to create a stacked plot with a time se- 
ries of the angles in the top panel and a state space plot 
of $\theta_2$ versus $\theta_1$ in the bottom panel. (b) Create an ani- 
mated plot depicting the motion of the pendulum. Test 
your code for the following: 

Case 1 (small displacement): $L_1 = L_2 = 1$ m, $m_1 = m_2 = 
0.25$ kg, with initial conditions: $\theta_1 = 0.5$ rad and $\theta_2 = 
d\theta_1/dt = d\theta_2/dt = 0$. 
Case 2 (large displacement): $L_1 = L_2 = 1$ m, $m_1 = 0.5$ kg, 
$m_2 = 0.25$ kg, with initial conditions: $\theta_1 = 1$ rad and $\theta_2 = 
d\theta_1/dt = d\theta_2/dt = 0$. 


23.24 Figure P23.24 shows the forces exerted on a hot air 
balloon system. 


Formulate the drag force as 

$$ F_D = \frac{1}{2}\rho_a v^2 A C_d $$

where $\rho_a$ = air density (kg/m³), v = velocity (m/s), A = pro- 
jected frontal area (m²), and $C_d$ = the dimensionless drag co- 
efficient (= 0.47 for a sphere). Note also that the total mass 
of the balloon consists of two components: 

$$ m = m_G + m_P $$

where $m_G$ = the mass of the gas inside the expanded balloon 
(kg), and $m_P$ = the mass of the payload (basket, passengers, 
and the unexpanded balloon = 265 kg). Assume that the 
ideal gas law holds (P = ρRT), that the balloon is a perfect 
sphere with a diameter of 17.3 m, and that the heated air 
inside the envelope is at roughly the same pressure as the 
outside air. Other necessary parameters are normal atmo- 
spheric pressure, P = 101,300 Pa; gas constant for dry air, 
R = 287 J/(kg · K); average temperature of air inside the 
balloon, T = 100 °C; and the normal (ambient) air density, 
ρ = 1.2 kg/m³. 

FIGURE P23.24 
Forces on a hot air balloon: $F_B$ = buoyancy, $F_G$ = weight 
of gas, $F_P$ = weight of payload (including the balloon 
envelope), and $F_D$ = drag. Note that the direction of the 
drag is downward when the balloon is rising. 

(a) Use a force balance to develop the differential equa- 
tion for dv/dt as a function of the model’s fundamental 
parameters. 

(b) At steady-state, calculate the particle’s terminal velocity. 

(c) Use ode45 to compute the velocity and position of the 
balloon from t = 0 to 60 s given the previous parameters 
along with the initial condition: v(0) = 0. Develop a plot 
of your results. 

23.25 Develop a MATLAB script using ode45 to compute 

the velocity, v, and position, z, of a hot air balloon as de- 

scribed in Prob. 23.24. Perform the calculation from t = 0 

to 60 s with a step size of 1.6 s. At z = 200 m, assume that 

part of the payload (100 kg) is dropped out of the balloon. 

Develop a plot of your results. 

23.26 You go on a two-week vacation and place your pet 

goldfish “Freddie” into your bathtub. Note that you dechlo- 

rinate the water first! You then place an air tight plexiglass 
cover over the top of the tub in order to protect Freddie 
from your cat, Beelzebub. You mistakenly mix one table- 
spoon of sugar into the tub (you thought it was fish food!). 
Unfortunately, there are bacteria in the water (remember 
you got rid of the chlorine!), which break down the sugar, 
consuming dissolved oxygen in the process. The oxidation 
reaction follows first-order kinetics with a reaction rate of 
$k_d$ = 0.15/d. The tub initially has a sugar concentration of 
20 mgO₂/L and an oxygen concentration of 8.4 mgO₂/L. 
Note that the mass balances for the sugar (expressed in ox- 
ygen equivalents) and dissolved oxygen can be written as 

$$ \frac{dL}{dt} = -k_d L $$
$$ \frac{do}{dt} = -k_d L $$

where L = sugar concentration expressed as oxygen equiva- 
lents (mg/L), t = time (d), and o = dissolved oxygen con- 
centration (mg/L). Thus, as the sugar gets oxidized, an 
equivalent amount of oxygen is lost from the tub. Develop 
a MATLAB script using ode45 to numerically compute the 
concentrations of sugar and oxygen as a function of time 
and develop plots of each versus time. Use event to auto- 
matically stop when the oxygen concentration falls below a 
critical oxygen level of 2 mgO₂/L. 

23.27 The growth of bacteria from substrate can be repre- 
sented by the following pair of differential equations 


where X = bacterial biomass, t = time (d), Y = a yield coef- 
ficient, $k_{max}$ = maximum bacterial growth rate, S = substrate 
concentration, and $k_s$ = half saturation constant. The param- 
eter values are Y = 0.75, $k_{max}$ = 0.3, and $k_s$ = 1 x 10% and 
the initial conditions at t = 0 are S(0) = 5 and X(0) = 0.05. 
Note that neither X nor S can fall below zero as negative val- 
ues are impossible. (a) Use ode23 to solve for X and S from 
t = 0 to 20. (b) Repeat the solution, but set the relative toler- 
ance to 1 x 10%. (c) Keep repeating the solution with the 
relative tolerance set to 1 x 10%, but determine which of the 
MATLAB ode solvers (including the stiff solvers) obtains 
correct (i.e., positive) results. Use the tic and toc functions 
to determine the execution time for each option. 

23.28 The oscillations of a swinging pendulum can be sim- 
ulated with the following nonlinear model: 

$$\frac{d^2\theta}{dt^2} = -\frac{g}{l}\sin\theta$$

where $\theta$ = the angle of displacement (radians), g = the gravitational 
constant (= 9.81 m/s²), and l = the pendulum length. 
(a) Express this equation as a pair of first-order ODEs. 
(b) Use ode45 to solve for $\theta$ and $d\theta/dt$ as a function of time 
for the case where l = 0.65 m and the initial conditions are 
$\theta = \pi/8$ and $d\theta/dt = 0$. (c) Generate a plot of your results, 
and (d) use the diff function to generate a plot of the angular 
accelerations ($d^2\theta/dt^2$) versus time based on the vector 
of angular velocities ($d\theta/dt$) generated in (b). Use subplot to 
display all graphs as a single vertical three-panel plot with 
the top, middle, and bottom plots corresponding to $\theta$, $d\theta/dt$, 
and $d^2\theta/dt^2$, respectively. 

23.29 A number of individuals have made skydives from 
very high altitudes. Suppose that an 80-kg skydiver launches 
from an elevation of 36.500 km above the earth’s surface. 
The skydiver has a projected area, A = 0.55 m², and a dimensionless 
drag coefficient, $C_d$ = 1. Note that the gravitational 
acceleration, g (m/s²), can be related to elevation by 

$$g = 9.806412 - 0.003039734z$$

where z = elevation above the earth’s surface (km), and 
the density of air, $\rho$ (kg/m³), at various elevations can be 
tabulated as 

z (km)   ρ (kg/m³)    z (km)   ρ (kg/m³)    z (km)   ρ (kg/m³) 
  -1     1.3470          6     0.6601         25     0.04008 
   0     1.2250          7     0.5900         30     0.01841 
   1     1.1120          8     0.5258         40     0.003996 
   2     1.0070          9     0.4671         50     0.001027 
   3     0.9093         10     0.4135         60     0.0003097 
   4     0.8194         15     0.1948         70     8.283 × 10⁻⁵ 
   5     0.7364         20     0.08891        80     1.846 × 10⁻⁵ 


(a) Based on a force balance between gravity and drag, 
derive differential equations for the skydiver’s velocity 
and distance. 

(b) Use a numerical method to solve for velocity and dis- 
tance that terminates when the jumper reaches an eleva- 
tion that is a kilometer above the earth’s surface. 

(c) Plot your results. 

23.30 As depicted in Fig. P23.30, a parachutist jumps from 
an aircraft that is flying in a straight line parallel with the 
ground. (a) Using force balances, derive four differential 
equations for the rates of change of the x and y components 
of distances and velocities. [Hint: Recognize that $\sin\theta = v_y/v$ 
and $\cos\theta = v_x/v$.] (b) Employ Ralston’s 2nd-order method 
with $\Delta t$ = 0.25 s to generate a solution from t = 0 until the 
parachutist hits the ground, assuming that the chute never 
opens. The drag coefficient is 0.25 kg/m, the mass is 80 kg, 
and the ground is 2000 km below the initial vertical position 
of the aircraft. The initial conditions are $v_x$ = 135 m/s and 
$v_y$ = x = y = 0. (c) Develop a plot of position on Cartesian 
(x, y) coordinates. 

FIGURE P23.30 

23.31 The basic differential equation of the elastic curve for 
a cantilever beam (Fig. P23.31) is given as 

$$EI\frac{d^2y}{dx^2} = -P(L - x)$$


where E = the modulus of elasticity and I = the moment of 
inertia. Solve for the deflection of the beam using ode45. 
The following parameter values apply: $E = 2 \times 10^{11}$ Pa, 
I = 0.00033 m⁴, P = 4.5 kN, and L = 3 m. Develop a plot of 
your results along with the analytical solution, 

$$y = -\frac{PLx^2}{2EI} + \frac{Px^3}{6EI}$$

FIGURE P23.31 


23.32 The following differential equations define the concentrations 
of three reactants in a closed system (Fig. P23.32): 

$$\frac{dc_1}{dt} = -k_{12}c_1 + k_{21}c_2 + k_{31}c_3$$

$$\frac{dc_2}{dt} = k_{12}c_1 - k_{21}c_2 - k_{23}c_2$$

$$\frac{dc_3}{dt} = k_{23}c_2 - k_{31}c_3$$

An experiment with initial conditions of $c_1(0)$ = 100 and 
$c_2(0) = c_3(0) = 0$ yields the following data: 


t     1     2     3     4     5     6     8     9     10    12    15 
c₁    85.3  66.6  60.6  56.1  49.1  45.3  41.9  37.8  33.7  34.4  35.1 
c₂    16.9  18.7  24.1  20.9  18.9  19.9  20.6  13.9  19.1  14.5  15.4 
c₃    4.7   7.9   20.1  22.8  32.5  37.7  42.4  47    50.5  52.3  51.3 

Use ode45 to integrate the equations and an optimization 
function to estimate the values of the k’s that minimize the 
sum of the squares of the discrepancies between the model 
predictions and the data. Employ initial guesses of 0.15 for 
all the k’s. 


FIGURE P23.32 


CHAPTER OBJECTIVES 


The primary objective of this chapter is to introduce you to solving boundary-value 
problems for ODEs. Specific objectives and topics covered are 


• Understanding the difference between initial-value and boundary-value problems. 
• Knowing how to express an nth-order ODE as a system of n first-order ODEs. 
• Knowing how to implement the shooting method for linear ODEs by using linear 
  interpolation to generate accurate “shots.” 
• Understanding how derivative boundary conditions are incorporated into the 
  shooting method. 
• Knowing how to solve nonlinear ODEs with the shooting method by using root 
  location to generate accurate “shots.” 
• Knowing how to implement the finite-difference method. 
• Understanding how derivative boundary conditions are incorporated into the 
  finite-difference method. 
• Knowing how to solve nonlinear ODEs with the finite-difference method by using 
  root-location methods for systems of nonlinear algebraic equations. 
• Familiarizing yourself with the built-in MATLAB function bvp4c for solving 
  boundary-value ODEs. 


YOU’VE GOT A PROBLEM 


To this point, we have been computing the velocity of a free-falling bungee jumper by 
integrating a single ODE: 

$$\frac{dv}{dt} = g - \frac{c_d}{m}v^2 \tag{24.1}$$

Suppose that rather than velocity, you are asked to determine the position of the jumper 
as a function of time. One way to do this is to recognize that velocity is the first derivative 
of distance: 

$$\frac{dx}{dt} = v \tag{24.2}$$


Thus, by solving the system of two ODEs represented by Eqs. (24.1) and (24.2), we can 
simultaneously determine both the velocity and the position. 

However, because we are now integrating two ODEs, we require two conditions to 
obtain the solution. We are already familiar with one way to do this for the case where we 
have values for both position and velocity at the initial time: 


$$x(t = 0) = x_i$$

$$v(t = 0) = v_i$$


Given such conditions, we can easily integrate the ODEs using the numerical techniques 
described in Chaps. 22 and 23. This is referred to as an initial-value problem. 
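
To make the initial-value case concrete, the following is a minimal sketch of how the pair of ODEs in Eqs. (24.1) and (24.2) might be integrated together with ode45. The parameter values (g = 9.81 m/s², a lumped drag coefficient of 0.25 kg/m, a mass of 68.1 kg), the zero initial conditions, and the 10-s time span are assumptions chosen purely for illustration; none of them is specified in this section.

```matlab
% Sketch of an initial-value problem: integrate dx/dt = v and dv/dt = g - (cdrag/m)*v^2.
% All numerical values here are illustrative assumptions.
g     = 9.81;     % gravitational acceleration (m/s^2)
cdrag = 0.25;     % lumped drag coefficient (kg/m)
m     = 68.1;     % mass of the jumper (kg)

% State vector y = [x; v]; both conditions are specified at t = 0
dydt = @(t,y) [y(2); g - (cdrag/m)*y(2)^2];
y0   = [0; 0];                       % x(0) = 0 m, v(0) = 0 m/s
[t,y] = ode45(dydt,[0 10],y0);

plot(t,y(:,1),t,y(:,2),'--')
xlabel('t (s)'), legend('x (m)','v (m/s)')
```

Because both entries of y0 are known at t = 0, a single call to ode45 marches the solution forward without any guessing. That is exactly what is lost when one of the two conditions is moved to a later time.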

But what if we do not know values for both position and velocity at t = 0? Let’s say that 
we know the initial position but rather than having the initial velocity, we want the jumper 
to be at a specified position at a later time. In other words: 


$$x(t = 0) = x_i$$

$$x(t = t_f) = x_f$$


Because the two conditions are given at different values of the independent variable, this is 
called a boundary-value problem. 

Such problems require special solution techniques. Some of these are related to the 
methods for initial-value problems that were described in the previous two chapters. 
However, others employ entirely different strategies to obtain solutions. This chapter is 
designed to introduce you to the more common of these methods. 


24.1 INTRODUCTION AND BACKGROUND 


24.1.1 What Are Boundary-Value Problems? 


An ordinary differential equation is accompanied by auxiliary conditions, which are used 
to evaluate the constants of integration that result during the solution of the equation. For 
an nth-order equation, n conditions are required. If all the conditions are specified at the 
same value of the independent variable, then we are dealing with an initial-value problem 
(Fig. 24.1a). To this point, the material in Part Six (Chaps. 22 and 23) has been devoted to 
this type of problem. 

In contrast, there are often cases when the conditions are not known at a single point 
but rather are given at different values of the independent variable. Because these values 
are often specified at the extreme points or boundaries of a system, they are customarily 
referred to as boundary-value problems (Fig. 24.1b). A variety of significant engineering 
applications fall within this class. In this chapter, we discuss some of the basic approaches 
for solving such problems. 

[Figure 24.1 panels: (a) $dy_1/dt = f_1(t, y_1, y_2)$ and $dy_2/dt = f_2(t, y_1, y_2)$, where at t = 0, $y_1 = y_{1,0}$ and $y_2 = y_{2,0}$ (initial conditions); (b) $d^2y/dx^2 = f(x, y)$, where at x = 0, $y = y_0$.] 

FIGURE 24.1 
Initial-value versus boundary-value problems. (a) An initial-value problem where all the 
conditions are specified at the same value of the independent variable. (b) A boundary-value 
problem where the conditions are specified at different values of the independent variable. 


24.1.2 Boundary-Value Problems in Engineering and Science 


At the beginning of this chapter, we showed how the determination of the position and 
velocity of a falling object could be formulated as a boundary-value problem. For that 
example, a pair of ODEs was integrated in time. Although other time-variable examples 
can be developed, boundary-value problems arise more naturally when integrating in 
space. This occurs because auxiliary conditions are often specified at different positions 
in space. 

A case in point is the simulation of the steady-state temperature distribution for a long, 
thin rod positioned between two constant-temperature walls (Fig. 24.2). The rod’s cross- 
sectional dimensions are small enough so that radial temperature gradients are minimal 
and, consequently, temperature is a function exclusively of the axial coordinate x. Heat is 
transferred along the rod’s longitudinal axis by conduction and between the rod and the 
surrounding gas by convection. For this example, radiation is assumed to be negligible.¹ 


¹ We incorporate radiation into this problem later in this chapter in Example 24.4. 




FIGURE 24.2 
A heat balance for a differential element of a heated rod subject to conduction and 
convection. 


As depicted in Fig. 24.2, a heat balance can be taken around a differential element of 
thickness $\Delta x$ as 

$$0 = q(x)A_c - q(x + \Delta x)A_c + hA_s(T_\infty - T) \tag{24.3}$$

where q(x) = flux into the element due to conduction [J/(m²·s)]; $q(x + \Delta x)$ = flux out 
of the element due to conduction [J/(m²·s)]; $A_c$ = cross-sectional area [m²] = $\pi r^2$, r = the 
radius [m]; h = the convection heat transfer coefficient [J/(m²·K·s)]; $A_s$ = the element’s 
surface area [m²] = $2\pi r\Delta x$; $T_\infty$ = the temperature of the surrounding gas [K]; and T = the 
rod’s temperature [K]. 

Equation (24.3) can be divided by the element’s volume ($\pi r^2\Delta x$) to yield 

$$0 = \frac{q(x) - q(x + \Delta x)}{\Delta x} + \frac{2h}{r}(T_\infty - T)$$

Taking the limit $\Delta x \to 0$ gives 

$$0 = -\frac{dq}{dx} + \frac{2h}{r}(T_\infty - T) \tag{24.4}$$

The flux can be related to the temperature gradient by Fourier’s law: 

$$q = -k\frac{dT}{dx} \tag{24.5}$$

where k = the coefficient of thermal conductivity [J/(s·m·K)]. Equation (24.5) can be 
differentiated with respect to x, substituted into Eq. (24.4), and the result divided by k to yield 

$$0 = \frac{d^2T}{dx^2} + h'(T_\infty - T) \tag{24.6}$$

where $h'$ = a bulk heat-transfer parameter reflecting the relative impacts of convection and 
conduction [m⁻²] = 2h/(rk). 

Equation (24.6) represents a mathematical model that can be used to compute the 
temperature along the rod’s axial dimension. Because it is a second-order ODE, two conditions 
are required to obtain a solution. As depicted in Fig. 24.2, a common case is where the 
temperatures at the ends of the rod are held at fixed values. These can be expressed 
mathematically as 

$$T(0) = T_a$$

$$T(L) = T_b$$

The fact that they physically represent the conditions at the rod’s “boundaries” is the origin 
of the terminology: boundary conditions. 

Given these conditions, the model represented by Eq. (24.6) can be solved. Because 
this particular ODE is linear, an analytical solution is possible as illustrated in the 
following example. 


EXAMPLE 24.1 
Analytical Solution for a Heated Rod 


Problem Statement. Use calculus to solve Eq. (24.6) for a 10-m rod with $h'$ = 
0.05 m⁻² [h = 1 J/(m²·K·s), r = 0.2 m, k = 200 J/(s·m·K)], $T_\infty$ = 200 K, and the 
boundary conditions: 


T(0) = 300 K     T(10) = 400 K 


Solution. This ODE can be solved in a number of ways. A straightforward approach is to 
first express the equation as 

$$\frac{d^2T}{dx^2} - h'T = -h'T_\infty$$

Because this is a linear ODE with constant coefficients, the general solution can be readily 
obtained by setting the right-hand side to zero and assuming a solution of the form 
$T = e^{\lambda x}$. Substituting this solution along with its second derivative into the homogeneous 
form of the ODE yields 

$$\lambda^2 e^{\lambda x} - h'e^{\lambda x} = 0$$

which can be solved for $\lambda = \pm\sqrt{h'}$. Thus, the general solution is 

$$T = Ae^{\lambda x} + Be^{-\lambda x}$$

where A and B are constants of integration. Using the method of undetermined coefficients, 
we can derive the particular solution $T = T_\infty$. Therefore, the total solution is 

$$T = T_\infty + Ae^{\lambda x} + Be^{-\lambda x}$$

The constants can be evaluated by applying the boundary conditions 

$$T_a = T_\infty + A + B$$

$$T_b = T_\infty + Ae^{\lambda L} + Be^{-\lambda L}$$

These two equations can be solved simultaneously for 

$$A = \frac{(T_a - T_\infty)e^{-\lambda L} - (T_b - T_\infty)}{e^{-\lambda L} - e^{\lambda L}}$$

$$B = \frac{(T_b - T_\infty) - (T_a - T_\infty)e^{\lambda L}}{e^{-\lambda L} - e^{\lambda L}}$$

Substituting the parameter values from this problem gives A = 20.4671 and B = 79.5329. 
Therefore, the final solution is 

$$T = 200 + 20.4671e^{\sqrt{0.05}\,x} + 79.5329e^{-\sqrt{0.05}\,x} \tag{24.7}$$


FIGURE 24.3 
Analytical solution for the heated rod. 


As can be seen in Fig. 24.3, the solution is a smooth curve connecting the two bound- 
ary temperatures. The temperature in the middle is depressed due to the convective heat 
loss to the cooler surrounding gas. 
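
As a quick numerical check of Example 24.1, the sketch below evaluates the constants of integration from the two boundary conditions and then tabulates the temperature every 2 m along the rod. The variable names (hp for h′, Tinf for the surrounding temperature, and so on) are our own choices for illustration.

```matlab
% Evaluate the analytical solution of Example 24.1 at x = 0, 2, ..., 10 m.
hp = 0.05;  Tinf = 200;  L = 10;   % h' (1/m^2), surrounding temperature (K), length (m)
Ta = 300;   Tb = 400;              % boundary temperatures (K)
lambda = sqrt(hp);

% Constants of integration obtained from the two boundary conditions
den = exp(-lambda*L) - exp(lambda*L);
A = ((Ta - Tinf)*exp(-lambda*L) - (Tb - Tinf))/den;
B = ((Tb - Tinf) - (Ta - Tinf)*exp(lambda*L))/den;

x = (0:2:L)';
T = Tinf + A*exp(lambda*x) + B*exp(-lambda*x);
fprintf('A = %.4f, B = %.4f\n',A,B)
disp([x T])
```

The printed constants should reproduce A = 20.4671 and B = 79.5329, and the first and last rows of the table should return the boundary temperatures of 300 and 400 K.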


In the following sections, we will illustrate numerical approaches for solving the same 
problem we just solved analytically in Example 24.1. The exact analytical solution will be 
useful in assessing the accuracy of the solutions obtained with the approximate, numerical 
methods. 


24.2 THE SHOOTING METHOD 


The shooting method is based on converting the boundary-value problem into an equiva- 
lent initial-value problem. A trial-and-error approach is then implemented to develop a 
solution for the initial-value version that satisfies the given boundary conditions. 

Although the method can be employed for higher-order and nonlinear equations, it is 
nicely illustrated for a second-order, linear ODE such as the heated rod described in the 
previous section: 

$$0 = \frac{d^2T}{dx^2} + h'(T_\infty - T) \tag{24.8}$$

subject to the boundary conditions 

$$T(0) = T_a$$

$$T(L) = T_b$$

We convert this boundary-value problem into an initial-value problem by defining the 
rate of change of temperature, or gradient, as 

$$\frac{dT}{dx} = z \tag{24.9}$$

and reexpressing Eq. (24.8) as 

$$\frac{dz}{dx} = -h'(T_\infty - T) \tag{24.10}$$

Thus, we have converted the single second-order equation [Eq. (24.8)] into a pair of 
first-order ODEs [Eqs. (24.9) and (24.10)]. 

If we had initial conditions for both T and z, we could solve these equations as an 
initial-value problem with the methods described in Chaps. 22 and 23. However, because we only 
have an initial value for one of the variables, $T(0) = T_a$, we simply make a guess for the other, 
$z(0) = z_{a1}$, and then perform the integration. 

After performing the integration, we will have generated a value of T at the end of the 
interval, which we will call $T_{b1}$. Unless we are incredibly lucky, this result will differ from 
the desired result $T_b$. 

Now, let’s say that the value of $T_{b1}$ is too high ($T_{b1} > T_b$). It would then make sense that a 
lower value of the initial slope, $z(0) = z_{a2}$, might result in a better prediction. Using this new 
guess, we can integrate again to generate a second result at the end of the interval, $T_{b2}$. We 
could then continue guessing in a trial-and-error fashion until we arrived at a guess for z(0) 
that resulted in the correct value of $T(L) = T_b$. 

At this point, the origin of the name shooting method should be pretty clear. Just as you 
would adjust the angle of a cannon in order to hit a target, we are adjusting the trajectory of 
our solution by guessing values of z(0) until we hit our target $T(L) = T_b$. 

Although we could certainly keep guessing, a more efficient strategy is possible for 
linear ODEs. In such cases, the trajectory of the perfect shot $z_a$ is linearly related to the 
results of our two erroneous shots ($z_{a1}$, $T_{b1}$) and ($z_{a2}$, $T_{b2}$). Consequently, linear interpolation 
can be employed to arrive at the required trajectory: 

$$z_a = z_{a1} + \frac{z_{a2} - z_{a1}}{T_{b2} - T_{b1}}(T_b - T_{b1}) \tag{24.11}$$


The approach can be illustrated by an example. 


EXAMPLE 24.2 
The Shooting Method for a Linear ODE 


Problem Statement. Use the shooting method to solve Eq. (24.6) for the same conditions 
as Example 24.1: L = 10 m, $h'$ = 0.05 m⁻², $T_\infty$ = 200 K, T(0) = 300 K, and T(10) = 
400 K. 


Solution. Equation (24.6) is first expressed as a pair of first-order ODEs: 

$$\frac{dT}{dx} = z$$

$$\frac{dz}{dx} = -0.05(200 - T)$$


Along with the initial value for temperature T(0) = 300 K, we arbitrarily guess a value of 
$z_{a1}$ = −5 K/m for the initial value for z(0). The solution is then obtained by integrating the 
pair of ODEs from x = 0 to 10. We can do this with MATLAB’s ode45 function by first 
setting up an M-file to hold the differential equations: 

function dy=Ex2402(x,y) 
% y(1) = T and y(2) = z = dT/dx 
dy=[y(2);-0.05*(200-y(1))]; 


We can then generate the solution as 


>> [t,y]=ode45(@Ex2402,[0 10],[300,-5]); 
>> Tb1=y(length(y))   % linear index: last entry of column 1, i.e., T at x = 10 


Tb1 = 
569.7539 


Thus, we obtain a value at the end of the interval of $T_{b1}$ = 569.7539 (Fig. 24.4a), which 
differs from the desired boundary condition of $T_b$ = 400. Therefore, we make another guess 
$z_{a2}$ = −20 and perform the computation again. This time, the result of $T_{b2}$ = 259.5131 is 
obtained (Fig. 24.4b). 


FIGURE 24.4 
Temperature (K) versus distance (m) computed with the shooting method: (a) the first “shot,” 
(b) the second “shot,” and (c) the final exact “hit.” 


Now, because the original ODE is linear, we can use Eq. (24.11) to determine the 
correct trajectory to yield the perfect shot: 

$$z_a = -5 + \frac{-20 - (-5)}{259.5131 - 569.7539}(400 - 569.7539) = -13.2075$$


This value can then be used in conjunction with ode45 to generate the correct solution, as 
depicted in Fig. 24.4c. 

Although it is not obvious from the graph, the analytical solution is also plotted on 
Fig. 24.4c. Thus, the shooting method yields a solution that is virtually indistinguishable 
from the exact result. 
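
The steps of Example 24.2 can be collected into a single script. The following is a minimal sketch rather than the book's M-file: it replaces Ex2402 with an anonymous function, and it tightens the ode45 tolerances with odeset, which is our own assumption. With tighter tolerances the two trial end temperatures can differ slightly in their trailing digits from the values printed above.

```matlab
% Linear shooting for the heated rod: two trial shots plus interpolation, Eq. (24.11).
hp = 0.05;  Tinf = 200;  Ta = 300;  Tb = 400;  L = 10;
dydx = @(x,y) [y(2); -hp*(Tinf - y(1))];       % y = [T; z]
opts = odeset('RelTol',1e-8,'AbsTol',1e-8);    % assumed tolerances, tighter than the defaults

za1 = -5;   za2 = -20;                          % two arbitrary guesses for z(0)
[~,y1] = ode45(dydx,[0 L],[Ta; za1],opts);  Tb1 = y1(end,1);
[~,y2] = ode45(dydx,[0 L],[Ta; za2],opts);  Tb2 = y2(end,1);

% Interpolate to the slope that hits T(L) = Tb, then take the final shot
za = za1 + (za2 - za1)/(Tb2 - Tb1)*(Tb - Tb1);
[x,y] = ode45(dydx,[0 L],[Ta; za],opts);
fprintf('za = %.4f K/m, T(L) = %.4f K\n',za,y(end,1))
```

Because the ODE is linear, the end temperature is a linear function of the guessed slope, so the third integration should land on T(10) = 400 K to within the integration tolerance. No further iteration is needed.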


24.2.1 Derivative Boundary Conditions 


The fixed or Dirichlet boundary condition discussed to this point is but one of several 
types that are used in engineering and science. A common alternative is the case where 
the derivative is given. This is commonly referred to as a Neumann boundary condition. 

Because it is already set up to compute both the dependent variable and its derivative, 
incorporating derivative boundary conditions into the shooting method is relatively 
straightforward. 

Just as with the fixed-boundary condition case, we first express the second-order ODE 
as a pair of first-order ODEs. At this point, one of the required initial conditions, whether 
the dependent variable or its derivative, will be unknown. Based on guesses for the missing 
initial condition, we generate solutions to compute the given end condition. As with the 
initial condition, this end condition can either be for the dependent variable or its derivative. 
For linear ODEs, interpolation can then be used to determine the value of the missing 
initial condition required to generate the final, perfect “shot” that hits the end condition. 


EXAMPLE 24.3 
The Shooting Method with Derivative Boundary Conditions 


Problem Statement. Use the shooting method to solve Eq. (24.6) for the rod in 
Example 24.1: L = 10 m, $h'$ = 0.05 m⁻² [h = 1 J/(m²·K·s), r = 0.2 m, k = 200 J/(s·m·K)], 
$T_\infty$ = 200 K, and T(10) = 400 K. However, for this case, rather than having a fixed 
temperature of 300 K, the left end is subject to convection as in Fig. 24.5. 
For simplicity, we will assume that the convection heat transfer coefficient for the end area 
is the same as for the rod’s surface. 


FIGURE 24.5 
A rod with a convective boundary condition at one end and a fixed temperature at the other. 


Solution. As in Example 24.2, Eq. (24.6) is first expressed as 

$$\frac{dT}{dx} = z$$

$$\frac{dz}{dx} = -0.05(200 - T)$$

Although it might not be obvious, convection through the end is equivalent to specifying 
a gradient boundary condition. In order to see this, we must recognize that because 
the system is at steady state, convection must equal conduction at the rod’s left boundary 
(x = 0). Using Fourier’s law [Eq. (24.5)] to represent conduction, the heat balance at the 
end can be formulated as 

$$hA_c(T_\infty - T(0)) = -kA_c\frac{dT}{dx}(0) \tag{24.12}$$

This equation can be solved for the gradient 

$$\frac{dT}{dx}(0) = \frac{h}{k}(T(0) - T_\infty) \tag{24.13}$$

If we guess a value for temperature, we can see that this equation specifies the gradient. 

The shooting method is implemented by arbitrarily guessing a value for T(0). If we 
choose a value of $T(0) = T_{a1}$ = 300 K, Eq. (24.13) then yields the initial value for the 
gradient 

$$z_{a1} = \frac{dT}{dx}(0) = \frac{1}{200}(300 - 200) = 0.5$$

The solution is obtained by integrating the pair of ODEs from x = 0 to 10. We can do this 
with MATLAB’s ode45 function by first setting up an M-file to hold the differential 
equations in the same fashion as in Example 24.2. We can then generate the solution as 


>> [t,y]=ode45(@Ex2402,[0 10],[300,0.5]); 
>> Tb1=y(length(y)) 


Tb1 = 
683.5088 


As expected, the value at the end of the interval of $T_{b1}$ = 683.5088 K differs from the 
desired boundary condition of $T_b$ = 400 K. Therefore, we make another guess $T_{a2}$ = 150 K, 
which corresponds to $z_{a2}$ = −0.25, and perform the computation again. 


>> [t,y]=ode45(@Ex2402,[0 10],[150,-0.25]); 
>> Tb2=y(length(y)) 


Tb2 = 
-41.7544 


FIGURE 24.6 
The solution of a second-order ODE with a convective boundary condition at one end and a 
fixed temperature at the other. 


Linear interpolation can then be employed to compute the correct initial temperature: 

$$T_a = 300 + \frac{150 - 300}{-41.7544 - 683.5088}(400 - 683.5088) = 241.3643 \text{ K}$$

which corresponds to a gradient of $z_a$ = 0.2068. Using these initial conditions, ode45 can be 
employed to generate the correct solution, as depicted in Fig. 24.6. 

Note that we can verify that our boundary condition has been satisfied by substituting 
the initial conditions into Eq. (24.12) to give 

$$1\,\frac{\text{J}}{\text{m}^2\,\text{K}\,\text{s}} \times \pi \times (0.2\text{ m})^2 \times (200\text{ K} - 241.3643\text{ K}) = -200\,\frac{\text{J}}{\text{m}\,\text{K}\,\text{s}} \times \pi \times (0.2\text{ m})^2 \times 0.2068\,\frac{\text{K}}{\text{m}}$$

which can be evaluated to yield −5.1980 J/s = −5.1980 J/s. Thus, conduction and 
convection are equal and transfer heat out of the left end of the rod at a rate of 5.1980 W. 
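
The same calculation can be sketched as one script. The helper grad0, which turns a guessed end temperature into the matching gradient through Eq. (24.13), and the tightened tolerances are our own assumptions for illustration; the two guesses of 300 K and 150 K are those used in the example.

```matlab
% Shooting with a convective (derivative) condition at x = 0, following Example 24.3.
hp = 0.05;  Tinf = 200;  Tb = 400;  L = 10;    % h' (1/m^2), K, K, m
h = 1;  k = 200;                               % J/(m^2 K s) and J/(s m K)
dydx  = @(x,y) [y(2); -hp*(Tinf - y(1))];      % y = [T; z]
grad0 = @(T0) h/k*(T0 - Tinf);                 % Eq. (24.13): z(0) implied by a guess for T(0)
opts  = odeset('RelTol',1e-8,'AbsTol',1e-8);   % assumed tolerances, tighter than the defaults

Ta1 = 300;  [~,y1] = ode45(dydx,[0 L],[Ta1; grad0(Ta1)],opts);  Tb1 = y1(end,1);
Ta2 = 150;  [~,y2] = ode45(dydx,[0 L],[Ta2; grad0(Ta2)],opts);  Tb2 = y2(end,1);

% Linear interpolation for the initial temperature that yields T(L) = Tb
Ta = Ta1 + (Ta2 - Ta1)/(Tb2 - Tb1)*(Tb - Tb1);
za = grad0(Ta);
[x,y] = ode45(dydx,[0 L],[Ta; za],opts);
fprintf('Ta = %.4f K, za = %.4f K/m, T(L) = %.4f K\n',Ta,za,y(end,1))
```

Note that the unknown here is the temperature at the left end rather than the slope. Each guess for T(0) fixes z(0) automatically, so the interpolation is carried out on T(0) instead of z(0).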


24.2.2 The Shooting Method for Nonlinear ODEs 


For nonlinear boundary-value problems, linear interpolation or extrapolation through two 
solution points will not necessarily result in an accurate estimate of the required bound- 
ary condition to attain an exact solution. An alternative is to perform three applications of 
the shooting method and use a quadratic interpolating polynomial to estimate the proper 
boundary condition. However, it is unlikely that such an approach would yield the exact 
answer, and additional iterations would be necessary to home in on the solution. 

Another approach for a nonlinear problem involves recasting it as a roots problem. Re- 
call that the general goal of a roots problem is to find the value of x that makes the function 
f(x) = 0. Now, let us use the heated rod problem to understand how the shooting method 
can be recast in this form. 

First, recognize that the solution of the pair of differential equations is also a “function” 
in the sense that we guess a condition at the left-hand end of the rod $z_a$, and the 
integration yields a prediction of the temperature at the right-hand end $T_b$. Thus, we can think 
of the integration as 

$$T_b = f(z_a)$$

That is, it represents a process whereby a guess of $z_a$ yields a prediction of $T_b$. Viewed in 
this way, we can see that what we desire is the value of $z_a$ that yields a specific value of $T_b$. 
If, as in the example, we desire $T_b$ = 400, the problem can be posed as 

$$400 = f(z_a)$$

By bringing the goal of 400 over to the right-hand side of the equation, we generate a new 
function $res(z_a)$ that represents the difference, or residual, between what we have, $f(z_a)$, 
and what we want, 400. 

$$res(z_a) = f(z_a) - 400$$


If we drive this new function to zero, we will obtain the solution. The next example illus- 
trates the approach. 


EXAMPLE 24.4 
The Shooting Method for Nonlinear ODEs 


Problem Statement. Although it served our purposes for illustrating the shooting method, 
Eq. (24.6) was not a completely realistic model for a heated rod. For one thing, such a rod 
would lose heat by mechanisms such as radiation that are nonlinear. 

Suppose that the following nonlinear ODE is used to simulate the temperature of the 
heated rod: 

$$0 = \frac{d^2T}{dx^2} + h'(T_\infty - T) + \sigma'(T_\infty^4 - T^4)$$

where $\sigma'$ = a bulk heat-transfer parameter reflecting the relative impacts of radiation and 
conduction = $2.7 \times 10^{-9}$ K⁻³ m⁻². This equation can serve to illustrate how the shooting 
method is used to solve a two-point nonlinear boundary-value problem. The remaining 
problem conditions are as specified in Example 24.2: L = 10 m, $h'$ = 0.05 m⁻², 
$T_\infty$ = 200 K, T(0) = 300 K, and T(10) = 400 K. 


Solution. Just as with the linear ODE, the nonlinear second-order equation is first 
expressed as two first-order ODEs: 

$$\frac{dT}{dx} = z$$

$$\frac{dz}{dx} = -0.05(200 - T) - 2.7 \times 10^{-9}(1.6 \times 10^9 - T^4)$$

An M-file can be developed to compute the right-hand sides of these equations: 


function dy=dydxn(x,y) 
dy=[y(2);-0.05*(200-y(1))-2.7e-9*(1.6e9-y(1)^4)]; 


Next, we can build a function to hold the residual that we will try to drive to zero as 


function r=res(za) 
[x,y]=ode45(@dydxn,[0 10],[300 za]); 
r=y(length(x),1)-400; 


FIGURE 24.7 
The result of using the shooting method to solve a nonlinear problem. 


Notice how we use the ode45 function to solve the two ODEs to generate the temperature at 
the rod’s end: y(length(x),1). We can then find the root with the fzero function: 


>> fzero(@res,-50) 


ans = 
-41.7434 


Thus, we see that if we set the initial trajectory z(0) = −41.7434, the residual function will 
be driven to zero and the temperature boundary condition T(10) = 400 at the end of the rod 
should be satisfied. This can be verified by generating the entire solution and plotting the 
temperatures versus x: 


>> [x,y]=ode45(@dydxn,[0 10],[300 fzero(@res,-50)]); 
>> plot(x,y(:,1)) 


The result is shown in Fig. 24.7 along with the original linear case from Example 24.2. 
As expected, the nonlinear case is depressed lower than the linear model due to the addi- 
tional heat lost to the surrounding gas by radiation. 
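
For readers who prefer to keep everything in one script, the following sketch restates Example 24.4 with anonymous functions instead of separate M-files. The helper lastT and the variable names are hypothetical choices of ours. The sketch relies on the fact that ode45 called with a single output returns a solution structure whose field y holds the states column by column, so the last column is the state at the end of the integration.

```matlab
% Nonlinear shooting recast as a roots problem, using anonymous functions only.
hp = 0.05;  sp = 2.7e-9;  Tinf = 200;  Ta = 300;  Tb = 400;  L = 10;
dydx  = @(x,y) [y(2); -hp*(Tinf - y(1)) - sp*(Tinf^4 - y(1)^4)];   % y = [T; z]
lastT = @(sol) sol.y(1,end);                   % T at the last point of an ode45 solution structure
res   = @(za) lastT(ode45(dydx,[0 L],[Ta; za])) - Tb;   % residual to drive to zero

za = fzero(res,-50);                           % same initial guess as in the example
fprintf('z(0) = %.4f K/m, residual = %.2e K\n',za,res(za))

% Final shot with the converged slope
[x,y] = ode45(dydx,[0 L],[Ta; za]);
plot(x,y(:,1)), xlabel('x (m)'), ylabel('T (K)')
```

Since the default ode45 tolerances are kept, the root should land close to the −41.7434 reported in the example. Tightening the tolerances with odeset would be a reasonable refinement, at the cost of slightly different trailing digits.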


24.3 FINITE-DIFFERENCE METHODS 


The most common alternatives to the shooting method are finite-difference approaches. 
In these techniques, finite differences (Chap. 21) are substituted for the derivatives in the 
original equation. Thus, a linear differential equation is transformed into a set of simultane- 
ous algebraic equations that can be solved using the methods from Part Three. 

We can illustrate the approach for the heated rod model [Eq. (24.6)]: 

$$0 = \frac{d^2T}{dx^2} + h'(T_\infty - T) \tag{24.14}$$

The solution domain is first divided into a series of nodes (Fig. 24.8). At each node, 
finite-difference approximations can be written for the derivatives in the equation. For example, 
at node i, the second derivative can be represented by (Fig. 21.5): 

$$\frac{d^2T}{dx^2} = \frac{T_{i-1} - 2T_i + T_{i+1}}{\Delta x^2}$$


FIGURE 24.8 
In order to implement the finite-difference approach, the heated rod is divided into a series 
of nodes. 


This approximation can be substituted into Eq. (24.14) to give 

$$\frac{T_{i-1} - 2T_i + T_{i+1}}{\Delta x^2} + h'(T_\infty - T_i) = 0 \tag{24.15}$$

Thus, the differential equation has been converted into an algebraic equation. Collecting 
terms gives 

$$-T_{i-1} + (2 + h'\Delta x^2)T_i - T_{i+1} = h'\Delta x^2 T_\infty \tag{24.16}$$

This equation can be written for each of the n − 1 interior nodes of the rod. The first and 
last nodes, $T_0$ and $T_n$, respectively, are specified by the boundary conditions. Therefore, the 
problem reduces to solving n − 1 simultaneous linear algebraic equations for the n − 1 
unknowns. 

Before providing an example, we should mention two nice features of Eq. (24.16). 
First, observe that since the nodes are numbered consecutively, and since each equation 
consists of a node (i) and its adjoining neighbors (i — 1 and i + 1), the resulting set of lin- 
ear algebraic equations will be tridiagonal. As such, they can be solved with the efficient 
algorithms that are available for such systems (recall Sec. 9.4). 

Further, inspection of the coefficients on the left-hand side of Eq. (24.16) indicates that 
the system of linear equations will also be diagonally dominant. Hence, convergent solutions 
can also be generated with iterative techniques like the Gauss-Seidel method (Sec. 12.1). 
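
As a rough illustration of the second point, the sketch below applies Gauss-Seidel sweeps directly to Eq. (24.16) for the heated rod of Example 24.1. The number of segments (n = 5, so that Δx = 2 m), the starting guess of 200 K at the interior nodes, the stopping tolerance, and the iteration cap are all assumptions made for the illustration.

```matlab
% Gauss-Seidel sweeps over Eq. (24.16) for the heated rod (parameters from Example 24.1).
hp = 0.05;  Tinf = 200;  Ta = 300;  Tb = 400;  L = 10;
n  = 5;  dx = L/n;                   % n segments, hence n-1 interior nodes
T  = [Ta; Tinf*ones(n-1,1); Tb];     % T(1) and T(n+1) hold the boundary values
for iter = 1:200
    Told = T;
    for i = 2:n                      % interior nodes only; updated values are used at once
        T(i) = (T(i-1) + T(i+1) + hp*dx^2*Tinf)/(2 + hp*dx^2);
    end
    if max(abs(T - Told)) < 1e-8, break, end
end
fprintf('Stopped after %d sweeps\n',iter)
disp(T')
```

Each update simply solves Eq. (24.16) for the temperature at node i using the most recent neighboring values. Because the diagonal coefficient 2 + h′Δx² exceeds the sum of the two off-diagonal magnitudes, the sweeps settle down without any relaxation.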


EXAMPLE 24.5 
Finite-Difference Approximation of Boundary-Value Problems 

Problem Statement. Use the finite-difference approach to solve the same problem as in 
Examples 24.1 and 24.2. Use four interior nodes with a segment length of $\Delta x$ = 2 m. 

Solution. Employing the parameters in Example 24.1 and $\Delta x$ = 2 m, we can write 
Eq. (24.16) for each of the rod’s interior nodes. For example, for node 1: 

$$-T_0 + 2.2T_1 - T_2 = 40$$

Substituting the boundary condition $T_0$ = 300 gives 

$$2.2T_1 - T_2 = 340$$

After writing Eq. (24.16) for the other interior nodes, the equations can be assembled in 
matrix form as 

$$\begin{bmatrix} 2.2 & -1 & 0 & 0 \\ -1 & 2.2 & -1 & 0 \\ 0 & -1 & 2.2 & -1 \\ 0 & 0 & -1 & 2.2 \end{bmatrix} \begin{Bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{Bmatrix} = \begin{Bmatrix} 340 \\ 40 \\ 40 \\ 440 \end{Bmatrix}$$


Notice that the matrix is both tridiagonal and diagonally dominant. 
MATLAB can be used to generate the solution: 


>> A=[2.2 -1 0 0; 
-1 2.2 -1 0; 
0 -1 2.2 -1; 
0 0 -1 2.2]; 
>> b=[340 40 40 440]'; 
>> T=A\b 


T = 
283.2660 
283.1853 
299.7416 
336.2462 

Table 24.1 provides a comparison between the analytical solution (Eq. 24.7) and the 
numerical solutions obtained with the shooting method (Example 24.2) and the finite- 
difference method (Example 24.5). Note that although there are some discrepancies, the 
numerical approaches agree reasonably well with the analytical solution. Further, the biggest 
discrepancy occurs for the finite-difference method due to the coarse node spacing we used 
in Example 24.5. Better agreement would occur if a finer nodal spacing had been used. 


TABLE 24.1 Comparison of the exact analytical solution for temperature with the results 
obtained with the shooting and finite-difference methods. 


        Analytical    Shooting      Finite 
 x      Solution      Method        Difference 
 0      300           300           300 
 2      282.8634      282.8889      283.2660 
 4      282.5775      282.6158      283.1853 
 6      299.0843      299.1254      299.7416 
 8      335.7404      335.7718      336.2462 
10      400           400           400 
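
To see how the discrepancy shrinks with a finer spacing, the following sketch assembles Eq. (24.16) for an arbitrary Δx and compares the result with the analytical solution, Eq. (24.7). The choice of Δx = 0.5 m and the use of spdiags to build the tridiagonal matrix are our own assumptions for illustration.

```matlab
% Finite-difference solution with a finer spacing, compared with Eq. (24.7).
hp = 0.05;  Tinf = 200;  Ta = 300;  Tb = 400;  L = 10;
dx = 0.5;  n = round(L/dx);  m = n - 1;          % m interior nodes
e  = ones(m,1);
A  = spdiags([-e (2 + hp*dx^2)*e -e],-1:1,m,m);  % tridiagonal coefficient matrix
b  = hp*dx^2*Tinf*e;
b(1) = b(1) + Ta;   b(m) = b(m) + Tb;            % move known boundary values to the right-hand side

T = [Ta; A\b; Tb];                               % prepend and append the boundary nodes
x = (0:dx:L)';
Texact = 200 + 20.4671*exp(sqrt(0.05)*x) + 79.5329*exp(-sqrt(0.05)*x);
fprintf('max |error| = %.4f K\n',max(abs(T - Texact)))
```

Because the centered difference is second-order accurate, reducing Δx from 2 m to 0.5 m should cut the largest discrepancy by roughly a factor of 16 relative to the finite-difference column of Table 24.1.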


24.3.1 Derivative Boundary Conditions 


As mentioned in our discussion of the shooting method, the fixed or Dirichlet boundary 
condition is but one of several types that are used in engineering and science. A common 
alternative, called the Neumann boundary condition, is the case where the derivative is 
given. 


FIGURE 24.9 
A boundary node at the left end of a heated rod. To approximate the derivative at the 
boundary, an imaginary node is located a distance $\Delta x$ to the left of the rod’s end. 


We can use the heated rod introduced earlier in this chapter to demonstrate how a 
derivative boundary condition can be incorporated into the finite-difference approach: 


$$0 = \frac{d^2T}{dx^2} + h'(T_\infty - T)$$

However, in contrast to our previous discussions, we will prescribe a derivative boundary 

condition at one end of the rod: 


$$\frac{dT}{dx}(0) = T'_a$$

$$T(L) = T_b$$


Thus, we have a derivative boundary condition at one end of the solution domain and a 
fixed boundary condition at the other. 

Just as in the previous section, the rod is divided into a series of nodes and a finite- 
difference version of the differential equation (Eq. 24.16) is applied to each interior node. 
However, because its temperature is not specified, the node at the left end must also 
be included. Fig. 24.9 depicts the node (0) at the left edge of a heated rod for which the 
derivative boundary condition applies. Writing Eq. (24.16) for this node gives 


$$-T_{-1} + (2 + h'\Delta x^2)T_0 - T_1 = h'\Delta x^2 T_\infty \qquad (24.17)$$


Notice that an imaginary node (−1) lying to the left of the rod’s end is required for 
this equation. Although this exterior point might seem to represent a difficulty, it actually 
serves as the vehicle for incorporating the derivative boundary condition into the problem. 
This is done by representing the first derivative in the x dimension at (0) by the centered 
difference [Eq. (4.25)]: 


$$\frac{dT}{dx} = \frac{T_1 - T_{-1}}{2\Delta x}$$


which can be solved for 


$$T_{-1} = T_1 - 2\Delta x\,\frac{dT}{dx}$$

Now we have a formula for $T_{-1}$ that actually reflects the impact of the derivative. It can be 
substituted into Eq. (24.17) to give 


$$(2 + h'\Delta x^2)T_0 - 2T_1 = h'\Delta x^2 T_\infty - 2\Delta x\,\frac{dT}{dx} \qquad (24.18)$$

Consequently, we have incorporated the derivative into the balance. 

A common example of a derivative boundary condition is the situation where the end 
of the rod is insulated. In this case, the derivative is set to zero. This conclusion follows 
directly from Fourier’s law [Eq. (24.5)], because insulating a boundary means that the heat 
flux (and consequently the gradient) must be zero. The following example illustrates how 
the solution is affected by such boundary conditions. 


EXAMPLE 24.6 Incorporating Derivative Boundary Conditions 


Problem Statement. Generate the finite-difference solution for a 10-m rod with Δx = 2 m, 
h' = 0.05 m⁻², T∞ = 200 K, and the boundary conditions: $T'_a = 0$ and $T_b = 400$ K. Note that 
the first condition means that the slope of the solution should approach zero at the rod’s left 
end. Aside from this case, also generate the solution for dT/dx = −20 at x = 0. 


Solution. Equation (24.18) can be used to represent node 0 as 

$$2.2T_0 - 2T_1 = 40$$

We can write Eq. (24.16) for the interior nodes. For example, for node 1, 

$$-T_0 + 2.2T_1 - T_2 = 40$$


A similar approach can be used for the remaining interior nodes. The final system of equa- 
tions can be assembled in matrix form as 


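$$
\begin{bmatrix}
2.2 & -2 & & & \\
-1 & 2.2 & -1 & & \\
 & -1 & 2.2 & -1 & \\
 & & -1 & 2.2 & -1 \\
 & & & -1 & 2.2
\end{bmatrix}
\begin{Bmatrix} T_0 \\ T_1 \\ T_2 \\ T_3 \\ T_4 \end{Bmatrix}
=
\begin{Bmatrix} 40 \\ 40 \\ 40 \\ 40 \\ 440 \end{Bmatrix}
$$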


These equations can be solved for 


$T_0$ = 243.0278 
$T_1$ = 247.3306 
$T_2$ = 261.0994 
$T_3$ = 287.0882 
$T_4$ = 330.4946 


As displayed in Fig. 24.10, the solution is flat at x = 0 due to the zero derivative condition 
and then curves upward to the fixed condition of T = 400 at x = 10. 

FIGURE 24.10 
The solution of a second-order ODE with a derivative boundary condition at one end and a fixed 
boundary condition at the other. Two cases are shown reflecting different derivative values at x = 0. 


For the case where the derivative at x = 0 is set to —20, the simultaneous equations are 


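$$
\begin{bmatrix}
2.2 & -2 & & & \\
-1 & 2.2 & -1 & & \\
 & -1 & 2.2 & -1 & \\
 & & -1 & 2.2 & -1 \\
 & & & -1 & 2.2
\end{bmatrix}
\begin{Bmatrix} T_0 \\ T_1 \\ T_2 \\ T_3 \\ T_4 \end{Bmatrix}
=
\begin{Bmatrix} 120 \\ 40 \\ 40 \\ 40 \\ 440 \end{Bmatrix}
$$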

which can be solved for 

$T_0$ = 328.2710 
$T_1$ = 301.0981 
$T_2$ = 294.1448 
$T_3$ = 306.0204 
$T_4$ = 339.1002 


As in Fig. 24.10, the solution at x = 0 now curves downward due to the negative derivative 
we imposed at the boundary. 
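
Both cases of this example can be reproduced with a few lines of MATLAB. This is a sketch under the example's stated parameters; the variable names (hp, Tinf, slopes) are illustrative assumptions, not the book's. The only change relative to the Dirichlet system is the first row, which follows Eq. (24.18): the superdiagonal entry becomes −2 and the right-hand side picks up the term −2Δx(dT/dx).

```matlab
% Sketch: finite differences with a derivative (Neumann) condition at x = 0
hp = 0.05; Tinf = 200; dx = 2; Tb = 400;   % example parameters
n = 5;                                     % unknowns are nodes 0 through 4
A = (2 + hp*dx^2)*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);
A(1,2) = -2;                               % Eq. (24.18): imaginary node eliminated

slopes = [0 -20];                          % prescribed dT/dx at x = 0
T = zeros(n, numel(slopes));
for k = 1:numel(slopes)
  b = hp*dx^2*Tinf*ones(n,1);
  b(1) = b(1) - 2*dx*slopes(k);            % derivative enters the right-hand side
  b(n) = b(n) + Tb;                        % fixed temperature at x = 10
  T(:,k) = A\b;
end
disp(T)                                    % column 1: insulated end; column 2: dT/dx = -20
```

The first column should reproduce the insulated-end temperatures listed above, and the second column the values for dT/dx = −20.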


24.3.2 Finite-Difference Approaches for Nonlinear ODEs 


For nonlinear ODEs, the substitution of finite differences yields a system of nonlinear 
simultaneous equations. Thus, the most general approach to solving such problems is to 
use root-location methods for systems of equations such as the Newton-Raphson method 
described in Sec. 12.2.2. Although this approach is certainly feasible, an adaptation of suc- 
cessive substitution can sometimes provide a simpler alternative. 

The heated rod with convection and radiation introduced in Example 24.4 provides a 
nice vehicle for demonstrating this approach, 


$$0 = \frac{d^2T}{dx^2} + h'(T_\infty - T) + \sigma''(T_\infty^4 - T^4)$$

We can convert this differential equation into algebraic form by writing it for a node i and 
substituting Eq. (24.15) for the second derivative: 


$$0 = \frac{T_{i-1} - 2T_i + T_{i+1}}{\Delta x^2} + h'(T_\infty - T_i) + \sigma''(T_\infty^4 - T_i^4)$$


Collecting terms gives 


$$-T_{i-1} + (2 + h'\Delta x^2)T_i - T_{i+1} = h'\Delta x^2 T_\infty + \sigma''\Delta x^2(T_\infty^4 - T_i^4)$$


Notice that although there is a nonlinear term on the right-hand side, the left-hand side 
is expressed in the form of a linear algebraic system that is diagonally dominant. If we as- 
sume that the unknown nonlinear term on the right is equal to its value from the previous 
iteration, the equation can be solved for 


$$T_i = \frac{h'\Delta x^2 T_\infty + \sigma''\Delta x^2(T_\infty^4 - T_i^4) + T_{i-1} + T_{i+1}}{2 + h'\Delta x^2} \qquad (24.19)$$


As in the Gauss-Seidel method, we can use Eq. (24.19) to successively calculate the tem- 
perature of each node and iterate until the process converges to an acceptable tolerance. 
Although this approach will not work for all cases, it converges for many ODEs derived 
from physically based systems. Hence, it can sometimes prove useful for solving problems 
routinely encountered in engineering and science. 


EXAMPLE 24.7 The Finite-Difference Method for Nonlinear ODEs 


Problem Statement. Use the finite-difference approach to simulate the temperature of a 
heated rod subject to both convection and radiation: 


$$0 = \frac{d^2T}{dx^2} + h'(T_\infty - T) + \sigma''(T_\infty^4 - T^4)$$

where σ'' = 2.7 × 10⁻⁹ K⁻³ m⁻², L = 10 m, h' = 0.05 m⁻², T∞ = 200 K, T(0) = 300 K, 
and T(10) = 400 K. Use four interior nodes with a segment length of Δx = 2 m. Recall that 
we solved the same problem with the shooting method in Example 24.4. 


Solution. Using Eq. (24.19) we can successively solve for the temperatures of the rod’s 
interior nodes. As with the standard Gauss-Seidel technique, the initial values of the 
interior nodes are zero with the boundary nodes set at the fixed conditions of $T_0 = 300$ and 
$T_5 = 400$. The results for the first iteration are 


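$$T_1 = \frac{0.05(2)^2\,200 + 2.7\times10^{-9}(2)^2(200^4 - 0^4) + 300 + 0}{2 + 0.05(2)^2} = 159.2432$$

$$T_2 = \frac{0.05(2)^2\,200 + 2.7\times10^{-9}(2)^2(200^4 - 0^4) + 159.2432 + 0}{2 + 0.05(2)^2} = 97.9674$$

$$T_3 = \frac{0.05(2)^2\,200 + 2.7\times10^{-9}(2)^2(200^4 - 0^4) + 97.9674 + 0}{2 + 0.05(2)^2} = 70.4461$$

$$T_4 = \frac{0.05(2)^2\,200 + 2.7\times10^{-9}(2)^2(200^4 - 0^4) + 70.4461 + 400}{2 + 0.05(2)^2} = 226.8704$$

Note that the values listed for this first pass correspond to evaluating the radiation term at 
the newly computed nodal temperature rather than at the zero initial guess; with the zero guess 
substituted literally, the first expression evaluates to 162.4000. Because both forms have the 
same fixed point, the converged result is unaffected. 

The process can be continued until we converge on the final result: 


$T_0$ = 300 
$T_1$ = 250.4827 
$T_2$ = 236.2962 
$T_3$ = 245.7596 
$T_4$ = 286.4921 
$T_5$ = 400 


FIGURE 24.11 
The filled circles are the result of using the finite-difference method to solve a nonlinear 
problem. The line generated with the shooting method in Example 24.4 is shown for 
comparison. 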


These results are displayed in Fig. 24.11 along with the result generated in Example 24.4 
with the shooting method. 
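
The iteration of Eq. (24.19) is short enough to sketch directly. The script below is an illustrative implementation, not the book's: the names hp, sig, tol, and maxit, and the stopping test on the largest change between sweeps, are assumptions made here. Within each sweep the radiation term uses the value of the node from the previous sweep, while the left neighbor has already been updated, in the manner of Gauss-Seidel.

```matlab
% Sketch: successive substitution with Eq. (24.19) for the nonlinear rod
hp = 0.05; sig = 2.7e-9; Tinf = 200; dx = 2;   % example parameters
T = [300 0 0 0 0 400];      % nodes 0..5; interior guesses start at zero
tol = 1e-8; maxit = 500;    % illustrative stopping criteria (assumed)

for iter = 1:maxit
  Told = T;
  for i = 2:5               % MATLAB indices 2..5 hold interior nodes 1..4
    % T(i) on the right is still the previous-sweep value (lagged term)
    T(i) = (hp*dx^2*Tinf + sig*dx^2*(Tinf^4 - T(i)^4) ...
            + T(i-1) + T(i+1)) / (2 + hp*dx^2);
  end
  if max(abs(T - Told)) < tol, break, end
end
fprintf('Sweeps used: %d\n', iter);
disp(T)
```

The converged temperatures should agree with the final result listed in the example. For other parameter values convergence is not guaranteed, as the text cautions, so the sweep limit maxit is worth keeping.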


24.4 MATLAB FUNCTION: bvp4c 


The bvp4c function solves ODE boundary-value problems by integrating a system of ordi- 
nary differential equations of the form y’ = f(x, y) on the interval [a, b], subject to general 
two-point boundary conditions. A simple representation of its syntax is 


sol = bvp4c(odefun, bcfun, solinit) 


where sol = a structure containing the solution, odefun = the function that sets up the ODEs 
to be solved, bcfun = the function that computes the residuals in the boundary conditions, 
and solinit = a structure with fields holding an initial mesh and initial guesses for the 
solution. 

The general format of odefun is 


dy = odefun(x,y) 


where x = a scalar, y = a column vector holding the dependent variables [y1; y2], and dy = a 
column vector holding the derivatives [dy1; dy2]. 
The general format of bcfun is 


res = bcfun(ya,yb) 


where ya and yb = column vectors holding the values of the dependent variables at the 
boundary values x = a and x = b, and res = a column vector holding the residuals between 
the computed and the specified boundary values. 

The general format of solinit is 


solinit = bvpinit(xmesh, yinit); 


where bvpinit = a built-in MATLAB function that creates the guess structure holding the 
initial mesh and solution guesses, xmesh = a vector holding the ordered nodes of the initial 
mesh, and yinit = a vector holding the initial guesses. Note that whereas your choices for 
the initial mesh and guesses will not be of great importance for linear ODEs, they can often 
be critical for efficiently solving nonlinear equations. 
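
As a sketch of how these three pieces fit together, the linear heated rod of Sec. 24.3.1 (insulated left end, T(10) = 400) can be posed for bvp4c. The names hp, Tinf, and Tb, the six-point initial mesh, the constant guess [300 0], and the tightened tolerances set with bvpset are all illustrative assumptions rather than choices taken from the text. The closed-form value used as a check, T(0) = T∞ + (Tb − T∞)/cosh(sqrt(h') L), is derived here for this linear case and is likewise not from the text.

```matlab
% Sketch: heated rod with dT/dx = 0 at x = 0 and T = Tb at x = L, via bvp4c
hp = 0.05; Tinf = 200; L = 10; Tb = 400;

% y(1) = T, y(2) = dT/dx; from 0 = T'' + hp*(Tinf - T)
odefun = @(x,y) [y(2); -hp*(Tinf - y(1))];
% Residuals: slope at the left end, temperature at the right end
bcfun  = @(ya,yb) [ya(2); yb(1) - Tb];

solinit = bvpinit(linspace(0,L,6), [300 0]);      % initial mesh and guesses
opts = bvpset('RelTol',1e-6,'AbsTol',1e-8);       % tighter than the defaults
sol = bvp4c(odefun, bcfun, solinit, opts);

T0 = sol.y(1,1);                                  % temperature at x = 0
Texact0 = Tinf + (Tb - Tinf)/cosh(sqrt(hp)*L);    % closed-form check (assumed)
fprintf('T(0): bvp4c = %.4f, closed form = %.4f\n', T0, Texact0);
```

The value at x = 0 should come out near 242.3, slightly below the 243.03 obtained with finite differences at Δx = 2, which is consistent with the coarse spacing used there.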


Solving a Boundary-Value Problem with bvp4c 


Problem Statement. Use bvp4c to solve the following second-order ODE 


$$\frac{d^2y}{dx^2} + y = 1$$

subject to the boundary conditions 

$$y(0) = 1$$

$$y(\pi/2) = 0$$

Solution. First, express the second-order equation as a pair of first-order ODEs 

$$\frac{dy}{dx} = z$$

$$\frac{dz}{dx} = 1 - y$$


Next, set up a function to hold the first-order ODEs 
function dy = odes(x,y) 
dy = [y(2); 1-y(1)]; 


We can now develop the function to hold the boundary conditions. This is done just 
like a roots problem in that we set up two functions that should be zero when the boundary 
conditions are satisfied. To do this, the vectors of unknowns at the left and right boundaries 
are defined as ya and yb. Hence, the first condition, y(0) = 1, can be formulated as ya(1)-1; 
whereas the second condition, y(π/2) = 0, corresponds to yb(1). 


function r = bcs(ya,yb) 
r = [ya(1)-1; yb(1)]; 


Finally, we can set up solinit to hold the initial mesh and solution guesses with the 
bvpinit function. We will arbitrarily select 10 equally spaced mesh points, and initial 
guesses of y = 1 and z = dy/dx = −1. 


solinit = bvpinit(linspace(0,pi/2,10),[1,-1]); 


The entire script to generate the solution is 


clc 
solinit = bvpinit(linspace(0,pi/2,10),[1,-1]); 
sol = bvp4c(@odes,@bcs,solinit); 
x = linspace(0,pi/2); 
y = deval(sol,x); 
plot(x,y(1,:)) 


where deval is a built-in MATLAB function which evaluates the solution of a differential 
equation problem with the general syntax 


yxint = deval(sol,xint) 


where deval evaluates the solution at all the values of the vector xint, and sol is the struc- 
ture returned by the ODE problem solver (in this case, bvp4c). 

When the script is run the plot below is generated. Note that the script and functions 
developed in this example can be applied to other boundary value problems with minor modi- 
fications. Several end-of-chapter problems are included to test your ability to do just that. 
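
One way to gain confidence in the script is to compare it with a closed-form solution. For this particular problem, y = 1 − sin x satisfies both the ODE and the two boundary conditions; that formula is worked out here and is not part of the original example. The sketch below repeats the example with anonymous functions (so that it runs as a single script without separate M-files) and reports the largest deviation; the name maxerr is illustrative.

```matlab
% Sketch: check the bvp4c result for y'' + y = 1, y(0) = 1, y(pi/2) = 0
odes = @(x,y) [y(2); 1 - y(1)];            % same ODEs as the odes M-file
bcs  = @(ya,yb) [ya(1) - 1; yb(1)];        % same residuals as the bcs M-file

solinit = bvpinit(linspace(0,pi/2,10), [1,-1]);
sol = bvp4c(odes, bcs, solinit);

x = linspace(0,pi/2);
y = deval(sol,x);
yexact = 1 - sin(x);                       % closed form (derived here, assumed)
maxerr = max(abs(y(1,:) - yexact));
fprintf('maximum deviation from 1 - sin(x): %g\n', maxerr);
```

The deviation should be small relative to the solution itself; tightening the tolerances with bvpset would be the natural next step if closer agreement were needed.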

PROBLEMS 


24.1 A steady-state heat balance for a rod can be represented as 

$$\frac{d^2T}{dx^2} - 0.15T = 0$$

Obtain a solution for a 10-m rod with T(0) = 240 and 
T(10) = 150 (a) analytically, (b) with the shooting method, 
and (c) using the finite-difference approach with Δx = 1. 
24.2 Repeat Prob. 24.1 but with the right end insulated and 
the left end temperature fixed at 240. 

24.3 Use the shooting method to solve 


with the boundary conditions y(0) = 5 and y(20) = 8. 

24.4 Solve Prob. 24.3 with the finite-difference approach 
using Δx = 2. 

24.5 The following nonlinear differential equation was 
solved in Examples 24.4 and 24.7. 

$$0 = \frac{d^2T}{dx^2} + h'(T_\infty - T) + \sigma''(T_\infty^4 - T^4) \qquad (P24.5)$$

Such equations are sometimes linearized to obtain an approximate 
solution. This is done by employing a first-order 
Taylor series expansion to linearize the quartic term in the 
equation as 

$$\sigma'' T^4 = \sigma'' \bar{T}^4 + 4\sigma'' \bar{T}^3 (T - \bar{T})$$

where $\bar{T}$ is a base temperature about which the term is linearized. 
Substitute this relationship into Eq. (P24.5), and then 
solve the resulting linear equation with the finite-difference 
approach. Employ $\bar{T}$ = 300, Δx = 1 m, and the parameters 
from Example 24.4 to obtain your solution. Plot your results 
along with those obtained for the nonlinear versions in 
Examples 24.4 and 24.7. 

24.6 Develop an M-file to implement the shooting method 
for a linear second-order ODE. Test the program by dupli- 
cating Example 24.2. 

24.7 Develop an M-file to implement the finite-differ- 
ence approach for solving a linear second-order ODE 
with Dirichlet boundary conditions. Test it by duplicating 
Example 24.5. 

24.8 An insulated heated rod with a uniform heat source can 
be modeled with the Poisson equation: 


$$\frac{d^2T}{dx^2} = -f(x)$$

Given a heat source f(x) = 25 °C/m² and the boundary conditions 
T(x = 0) = 40 °C and T(x = 10) = 200 °C, solve for 
the temperature distribution with (a) the shooting method and 
(b) the finite-difference method (Δx = 2). 

24.9 Repeat Prob. 24.8, but for the following spatially varying 
heat source: f(x) = 0.12x³ − 2.4x² + 12x. 

24.10 The temperature distribution in a tapered conical 
cooling fin (Fig. P24.10) is described by the following dif- 
ferential equation, which has been nondimensionalized: 


$$\frac{d^2u}{dx^2} + \left(\frac{2}{x}\right)\left(\frac{du}{dx} - pu\right) = 0$$

where u = temperature (0 ≤ u ≤ 1), x = axial distance 
(0 ≤ x ≤ 1), and p is a nondimensional parameter that describes 
the heat transfer and geometry: 

$$p = \frac{hL}{k}\sqrt{1 + \frac{4}{2m^2}}$$

where h = a heat transfer coefficient, k = thermal conductivity, 
L = the length or height of the cone, and m = the slope 
of the cone wall. The equation has the boundary conditions: 

$$u(x = 0) = 0 \qquad u(x = 1) = 1$$

Solve this equation for the temperature distribution using 
finite-difference methods. Use second-order accurate finite-difference 
formulas for the derivatives. Write a computer 
program to obtain the solution and plot temperature versus 
axial distance for various values of p = 10, 20, 50, and 100. 

24.11 Compound A diffuses through a 4-cm-long tube and 
reacts as it diffuses. The equation governing diffusion with 
reaction is 


At one end of the tube (x = 0), there is a large source of A 
that results in a fixed concentration of 0.1 M. At the other 
end of the tube there is a material that quickly absorbs any A, 
making the concentration 0 M. If D = 1.5 × 10⁻⁶ cm²/s and 
k = 5 × 10⁻⁶ s⁻¹, what is the concentration of A as a function 
of distance in the tube? 

24.12 The following differential equation describes the 
steady-state concentration of a substance that reacts with 
first-order kinetics in an axially dispersed plug-flow reactor 
(Fig. P24.12): 

$$D\frac{d^2c}{dx^2} - U\frac{dc}{dx} - kc = 0$$

where D = the dispersion coefficient (m²/hr), c = concentration 
(mol/L), x = distance (m), U = the velocity (m/hr), and 
k = the reaction rate (/hr). The boundary conditions can be 
formulated as 

$$Uc_{in} = Uc(x = 0) - D\frac{dc}{dx}(x = 0)$$

$$\frac{dc}{dx}(x = L) = 0$$

where $c_{in}$ = the concentration in the inflow (mol/L), L = 
the length of the reactor (m). These are called Danckwerts 
boundary conditions. 

Use the finite-difference approach to solve for concentration 
as a function of distance given the following 
parameters: D = 5000 m²/hr, U = 100 m/hr, k = 2/hr, 
L = 100 m, and $c_{in}$ = 100 mol/L. Employ centered finite-difference 
approximations with Δx = 10 m to obtain your 
solutions. Compare your numerical results with the analytical 
solution: 

$$c = \frac{Uc_{in}}{(U - D\lambda_1)\lambda_2 e^{\lambda_2 L} - (U - D\lambda_2)\lambda_1 e^{\lambda_1 L}} \left(\lambda_2 e^{\lambda_2 L} e^{\lambda_1 x} - \lambda_1 e^{\lambda_1 L} e^{\lambda_2 x}\right)$$

where $\lambda_1$ and $\lambda_2$ are the two roots of the characteristic equation $D\lambda^2 - U\lambda - k = 0$. 



24.13 A series of first-order, liquid-phase reactions create 
a desirable product (B) and an undesirable byproduct (C): 
$$A \xrightarrow{k_1} B \xrightarrow{k_2} C$$

If the reactions take place in an axially dispersed plug-flow 
reactor (Fig. P24.12), steady-state mass balances can be 
used to develop the following second-order ODEs: 

$$D\frac{d^2c_a}{dx^2} - U\frac{dc_a}{dx} - k_1 c_a = 0$$

$$D\frac{d^2c_b}{dx^2} - U\frac{dc_b}{dx} + k_1 c_a - k_2 c_b = 0$$

$$D\frac{d^2c_c}{dx^2} - U\frac{dc_c}{dx} + k_2 c_b = 0$$

Use the finite-difference approach to solve for the concentration 
of each reactant as a function of distance given: D = 
0.1 m²/min, U = 1 m/min, $k_1$ = 3/min, $k_2$ = 1/min, L = 
0.5 m, $c_{a,in}$ = 10 mol/L. Employ centered finite-difference 
approximations with Δx = 0.05 m to obtain your solutions 
and assume Danckwerts boundary conditions as described 
in Prob. 24.12. Also, compute the sum of the reactants as a 
function of distance. Do your results make sense? 

24.14 A biofilm with a thickness $L_f$ (cm) grows on the surface 
of a solid (Fig. P24.14). After traversing a diffusion 
layer of thickness L (cm), a chemical compound A diffuses 
into the biofilm where it is subject to an irreversible first-order 
reaction that converts it to a product B. 

Steady-state mass balances can be used to derive the 
following ordinary differential equations for compound A: 

$$D\frac{d^2c_a}{dx^2} = 0 \qquad 0 \le x < L$$

$$D_f\frac{d^2c_a}{dx^2} - kc_a = 0 \qquad L \le x < L + L_f$$

where D = the diffusion coefficient in the diffusion layer = 
0.8 cm²/d, $D_f$ = the diffusion coefficient in the biofilm = 
0.64 cm²/d, and k = the first-order rate for the conversion 
of A to B = 0.1/d. The following boundary conditions hold: 

$$c_a = c_{a0} \quad \text{at } x = 0$$

$$\frac{dc_a}{dx} = 0 \quad \text{at } x = L + L_f$$

where $c_{a0}$ = the concentration of A in the bulk liquid = 
100 mol/L. Use the finite-difference method to compute the 
steady-state distribution of A from x = 0 to L + $L_f$, where 
L = 0.008 cm and $L_f$ = 0.004 cm. Employ centered finite 
differences with Δx = 0.001 cm. 


FIGURE P24.12 
An axially dispersed plug-flow reactor. 


FIGURE P24.14 
A biofilm growing on a solid surface. 

24.15 A cable is hanging from two supports at A and B 
(Fig. P24.15). The cable is loaded with a distributed load 
whose magnitude varies with x as 


where $w_o$ = 450 N/m. The slope of the cable (dy/dx) = 0 at 
x = 0, which is the lowest point for the cable. It is also the 
point where the tension in the cable is a minimum of $T_o$. The 
differential equation which governs the cable is 


d’y _w, | [IX 

ae = T, |: + sin (=) 
Solve this equation using a numerical method and plot the 
shape of the cable (y versus x). For the numerical solution, 
the value of T, is unknown, so the solution must use an itera- 
tive technique, similar to the shooting method, to converge 
on a correct value of h, for various values of T,. 
24.16 The basic differential equation of the elastic curve for 
a simply supported, uniformly loaded beam (Fig. P24.16) is 
given as 

$$EI\frac{d^2y}{dx^2} = \frac{wLx}{2} - \frac{wx^2}{2}$$

where E = the modulus of elasticity and I = the moment of 
inertia. The boundary conditions are y(0) = y(L) = 0. Solve 
for the deflection of the beam using (a) the finite-difference 
approach (Δx = 0.6 m) and (b) the shooting method. 
The following parameter values apply: E = 200 GPa, I = 
30,000 cm⁴, w = 15 kN/m, and L = 3 m. Compare your 
numerical results to the analytical solution: 

$$y = \frac{wLx^3}{12EI} - \frac{wx^4}{24EI} - \frac{wL^3x}{24EI}$$


24.17 In Prob. 24.16, the basic differential equation of the 
elastic curve for a uniformly loaded beam was formulated as 


$$EI\frac{d^2y}{dx^2} = \frac{wLx}{2} - \frac{wx^2}{2}$$

Note that the right-hand side represents the moment as a 
function of x. An equivalent approach can be formulated in 
terms of the fourth derivative of deflection as 

$$EI\frac{d^4y}{dx^4} = -w$$

For this formulation, four boundary conditions are required. 
For the supports shown in Fig. P24.16, the conditions are 
that the end displacements are zero, y(0) = y(L) = 0, and 
that the end moments are zero, y″(0) = y″(L) = 0. Solve for 
the deflection of the beam using the finite-difference approach 
(Δx = 0.6 m). The following parameter values apply: 
E = 200 GPa, I = 30,000 cm⁴, w = 15 kN/m, and L = 3 m. 
Compare your numerical results with the analytical solution 
given in Prob. 24.16. 

24.18 Under a number of simplifying assumptions, the 
steady-state height of the water table in a one-dimensional, 
unconfined groundwater aquifer (Fig. P24.18) can be 
modeled with the following second-order ODE: 

$$K\bar{h}\frac{d^2h}{dx^2} + N = 0$$

where x = distance (m), K = hydraulic conductivity (m/d), 
h = height of the water table (m), $\bar{h}$ = the average height of 
the water table (m), and N = infiltration rate (m/d). 

Solve for the height of the water table for x = 0 to 
1000 m where h(0) = 10 m and h(1000) = 5 m. Use the 
following parameters for the calculation: K = 1 m/d and 
N = 0.0001 m/d. Set the average height of the water table as 
the average of the boundary conditions. Obtain your solution 
with (a) the shooting method and (b) the finite-difference 
method (Δx = 100 m). 


FIGURE P24.18 
An unconfined or “phreatic” aquifer. 

24.19 In Prob. 24.18, a linearized groundwater model was 
used to simulate the height of the water table for an uncon- 
fined aquifer. A more realistic result can be obtained by 
using the following nonlinear ODE: 


$$\frac{d}{dx}\left(Kh\frac{dh}{dx}\right) + N = 0$$

where x = distance (m), K = hydraulic conductivity (m/d), 
h = height of the water table (m), and N = infiltration 
rate (m/d). Solve for the height of the water table for the 
same case as in Prob. 24.18. That is, solve from x = 0 to 
1000 m with h(0) = 10 m, h(1000) = 5 m, K = 1 m/d, 
and N = 0.0001 m/d. Obtain your solution with (a) the 
shooting method and (b) the finite-difference method 
(Δx = 100 m). 

24.20 Just as Fourier’s law and the heat balance can be 
employed to characterize temperature distribution, analo- 
gous relationships are available to model field problems 
in other areas of engineering. For example, electrical engi- 
neers use a similar approach when modeling electrostatic 
fields. Under a number of simplifying assumptions, an ana- 
log of Fourier’s law can be represented in one-dimensional 
form as 

$$D = -\varepsilon\frac{dV}{dx}$$

where D is called the electric flux density vector, ε = permittivity 
of the material, and V = electrostatic potential. Similarly, 
a Poisson equation (see Prob. 24.8) for electrostatic 
fields can be represented in one dimension as 

$$\frac{d^2V}{dx^2} = -\frac{\rho_v}{\varepsilon}$$

where $\rho_v$ = charge density. Use the finite-difference technique 
with Δx = 2 to determine V for a wire where V(0) = 
1000, V(20) = 0, ε = 2, L = 20, and $\rho_v$ = 30. 


24.21 Suppose that the position of a falling object is gov- 
erned by the following differential equation: 


where c = a first-order drag coefficient = 12.5 kg/s, m = mass = 
70 kg, and g = gravitational acceleration = 9.81 m/s². Use 
the shooting method to solve this equation for the boundary 
conditions: 


x(0) =0 
x(12) = 500 


24.22 As in Fig. P24.22, an insulated metal rod has a 
fixed temperature ($T_0$) boundary condition at its left end. 
On its right end, it is joined to a thin-walled tube filled with 
water through which heat is conducted. The tube is insulated 
at its right end and convects heat with the surrounding 
fixed-temperature air ($T_\infty$). The convective heat flux at a location 
x along the tube (W/m²) is represented by 

$$J_{conv} = h\left(T_\infty - T_2(x)\right)$$

where h = the convection heat transfer coefficient 
[W/(m² · K)]. Employ the finite-difference method with 
Δx = 0.1 m to compute the temperature distribution for the 
case where both the rod and tube are cylindrical with the 
same radius r (m). Use the following parameters for your 
analysis: $L_{rod}$ = 0.6 m, $L_{tube}$ = 0.8 m, $T_0$ = 400 K, $T_\infty$ = 300 K, 
r = 3 cm, $\rho_1$ = 7870 kg/m³, $C_{p1}$ = 447 J/(kg · K), $k_1$ = 80.2 
W/(m · K), $\rho_2$ = 1000 kg/m³, $C_{p2}$ = 4.18 kJ/(kg · K), $k_2$ = 
0.615 W/(m · K), and h = 3000 W/(m² · K). The subscripts 
designate the rod (1) and the tube (2). 

24.23 Perform the same calculation as in Prob. 24.22, but 
for the case where the tube is also insulated (i.e., no con- 
vection) and the right-hand wall is held at a fixed boundary 
temperature of 200 K. 

24.24 Solve the following problem with bvp4c, 


subject to the following boundary conditions 

$$y(0) = 1$$

$$\frac{dy}{dx}(1) = 0$$

24.25 Figure P24.25a shows a uniform beam subject to a 
linearly increasing distributed load. The equation for the resulting 
elastic curve is (see Fig. P24.25b) 

$$EI\frac{d^2y}{dx^2} - \frac{w_0}{6}\left(0.6Lx - \frac{x^3}{L}\right) = 0$$

Note that the analytical solution for the resulting elastic 
curve is 

$$y = \frac{w_0}{120EIL}\left(-x^5 + 2L^2x^3 - L^4x\right)$$

Use bvp4c to solve the differential equation for the elastic 
curve for L = 600 cm, E = 50,000 kN/cm², I = 30,000 cm⁴, 
and $w_0$ = 2.5 kN/cm. Then, plot both the numerical (points) 
and the analytical (lines) solutions on the same graph. 

24.26 Use bvp4c to solve the boundary-value ordinary dif- 
ferential equation 


x ʻ + o% —u=2 
with boundary conditions u(0) = 10 and u(2) = 1. Plot the 
results of u versus x. 
24.27 Use bvp4c to solve the following nondimensionalized 
ODE that describes the temperature distribution in a circular 
rod with internal heat source S 


$$\frac{d^2T}{dr^2} + \frac{1}{r}\frac{dT}{dr} + S = 0$$

over the range 0 ≤ r ≤ 1, with the boundary conditions 

$$T(1) = 1$$

$$\frac{dT}{dr}(0) = 0$$

for S = 1, 10, and 20 K/m². Plot the temperature versus radius 
for all three cases on the same graph. 

24.28 A heated rod with a uniform heat source can be mod- 
eled with the Poisson equation, 


$$\frac{d^2T}{dx^2} = -f(x)$$

Given a heat source f(x) = 25 and the boundary conditions, 
T(0) = 40 and T(10) = 200, solve for the temperature distribution 
with (a) the shooting method, (b) the finite-difference 
method, and (c) bvp4c. 

24.29 Repeat Prob. 24.28, but for the following heat source: 
f(x) = 0.12x³ − 2.4x² + 12x. 

abs, 40 

acos, 40 

ascii, 64 

axis, 51 

axis square, 45 
beep, 76 

besselj, 451 
ceil, 40 

chol, 285 

clabel, 212 
clear, 64 

cond, 296, 297 
contour, 212, 566 
conv, 186 
cumtrapz, 510, 511 
deconv, 185 

det, 253 

diag, 338 

diff, 560, 561, 562 
disp, 61 

double, 61 

eig, 336 

elfun, 40 

eps, 112 

erf, 544 

error, 66 

event, 622 

exp, 40 

eye, 336 
factorial, 51, 73n3 
fft, 420 

fix, 459 


floor, 40, 212 
fminbnd, 137, 210, 211 
fminsearch, 213, 215, 199 
format bank, 30 
format compact, 28n1 
format long, 30, 102 
format long e, 30, 300 
format long eng, 30 
format long g, 30 
format loose, 28n1 
format short, 30 
format short e, 30, 300 
format short eng, 30 
format short g, 30 
fplot, 82 

fprintf, 62, 63 

fzero, 137, 181-183 
getframe, 77, 78 
gradient, 563, 564 
grid, 42 

help, 39, 46 

help elfun, 40 

hist, 353 

hold off, 44 

hold on, 44 

humps, 94, 540 

inline, 82 

input, 61 

interp1, 470 

interp2, 475 

interp3, 475 

inv, 236, 238 


isempty, 105 
legend, 398, 511 
length, 37 
LineWidth, 44 
linspace, 84 
load, 64, 65 

log, 39 

log10, 371 

log2, 150n2 
loglog, 51 
logspace, 34 
lookfor, 46, 55 
lu, 282 
MarkerEdgeColor, 44 
MarkerFaceColor, 44 
MarkerSize, 44 
max, 41, 262 
mean, 57, 352 
median, 352 
mesh, 76 
meshgrid, 212, 566 
min, 41, 352 
mode, 352 

movie, 77-78 
nargin, 71-72 
norm, 296 
ode113, 618 
ode15s, 632 
ode23, 617 
ode23s, 632 
ode23t, 632 
ode23tb, 632 
ode45, 618, 633 
odeset, 620 

ones, 32 

optimset, 182, 183, 189 
pause, 76 

pchip, 468, 471 

peaks, 572 

pi, 30 

plot, 42 

plot3, 44, 608 

poly, 184, 185 
polyfit, 373, 433, 446 
polyval, 373, 433, 446 
prod, 41 

quiver, 565 

rand, 353-356 

randn, 353, 356 


realmax, 112 
realmin, 112 
roots, 184-187 
round, 40 

save, 63 

semilogy, 50, 51 
set, 479 

sign, 69 

sin, 40 

size, 237 

sort, 41 

spline, 468 
sqrt, 40 

sqrtm, 40 

std, 352 

stem, 423 
subplot, 44 


sum, 40, 285 
surfc, 212 
tanh, 7, 35 
tic, 76 
title, 42 
toc, 76 
trapz, 510, 518 
var, 352 
varargin, 86 
who, 32 
whos, 32 
xlabel, 42 
ylabel, 42 
ylim, 424 
zeros, 32 
zlabel, 212 

M-file Name   Description                                                                  Page 
bisect        Root location with bisection                                                 151 
eulode        Integration of a single ordinary differential equation with Euler’s method   586 
fzerosimp     Brent’s method for root location                                             180 
GaussNaive    Solving linear systems with Gauss elimination without pivoting               258 
GaussPivot    Solving linear systems with Gauss elimination with partial pivoting          263 
GaussSeidel   Solving linear systems with the Gauss-Seidel method                          310 
goldmin       Minimum of one-dimensional function with golden-section search               208 
incsearch     Root location with an incremental search                                     144 
IterMeth      General algorithm for iterative calculation                                  105 
Lagrange      Interpolation with the Lagrange polynomial                                   443 
linregr       Fitting a straight line with linear regression                               372 
natspline     Cubic spline with natural end conditions                                     477 
Newtint       Interpolation with the Newton polynomial                                     440 
newtmult      Root location for nonlinear systems of equations                             318 
newtraph      Root location with the Newton-Raphson method                                 174 
quadadapt     Adaptive quadrature                                                          539 
rk4sys        Integration of system of ODEs with 4th-order RK method                       602 
romberg       Integration of a function with Romberg integration                           530 
TableLook     Table lookup with linear interpolation                                       458 
trap          Integration of a function with the composite trapezoidal rule                500 
trapuneq      Integration of unequispaced data with the trapezoidal rule                   509 
Tridiag       Solving tridiagonal linear systems                                           266 

Simulink® is a graphical programming environment for modeling, simulating, and analyzing 
dynamic systems. In short, it allows engineers and scientists to build process models 
by interconnecting blocks with communication lines. Thus, it provides an easy-to-use com- 
puting framework to quickly develop dynamic process models of physical systems. Along 
with offering a variety of numerical integration options for solving differential equations, 
Simulink includes built-in features for graphical output which significantly enhance visu- 
alization of a system’s behavior. 

As a historical footnote, back in the Pleistocene days of analog computers (aka the 
1950s), you had to design information flow diagrams that showed graphically how multiple 
ODEs in models were interconnected with themselves and with algebraic relationships. The 
diagrams also showed flaws in modeling where there was information lacking or structural 
defects. One of the nice features of Simulink is that it also does that. It’s often beneficial 
to see that aspect separate from numerical methods and then merge the two in MATLAB. 

As was done with Chap. 2, most of this appendix has been written as a hands-on exer- 
cise. That is, you should read it while sitting in front of your computer. The most efficient 
way to start learning Simulink is to actually implement it on MATLAB as you proceed 
through the following material. 

So let’s get started by setting up a simple Simulink application to solve an initial value 
problem for a single ODE. A nice candidate is the differential equation we developed for 
the velocity of the free-falling bungee jumper in Chap. 1, 

$$\frac{dv}{dt} = g - \frac{c_d}{m}v^2 \qquad (C.1)$$

where v = velocity (m/s), t = time (s), g = 9.81 m/s², $c_d$ = drag coefficient (kg/m), and m = 
mass (kg). As in Chap. 1, use $c_d$ = 0.25 kg/m, m = 68.1 kg, and integrate from 0 to 12 s with 
an initial condition of v = 0. 
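
Before building the Simulink model, it can help to have a reference solution to compare against. The sketch below integrates Eq. (C.1) with ode45 and compares the result with the closed form v = sqrt(g m/c_d) tanh(sqrt(g c_d/m) t). That closed form is not restated in this appendix; it is included here as an assumption that can be verified by substituting it into Eq. (C.1). The names cdrag, dvdt, and opts, and the tolerance settings, are illustrative choices (cdrag is used instead of cd so that MATLAB's cd command is not shadowed).

```matlab
% Sketch: reference solution for the free-falling bungee jumper, Eq. (C.1)
g = 9.81; cdrag = 0.25; m = 68.1;          % parameters from Chap. 1

dvdt = @(t,v) g - (cdrag/m)*v.^2;          % right-hand side of Eq. (C.1)
opts = odeset('RelTol',1e-6,'AbsTol',1e-8);
[t,v] = ode45(dvdt, [0 12], 0, opts);      % integrate from 0 to 12 s, v(0) = 0

% Closed-form solution (assumed; satisfies the ODE and v(0) = 0)
vexact = sqrt(g*m/cdrag)*tanh(sqrt(g*cdrag/m)*t);
maxerr = max(abs(v - vexact));
fprintf('v(12) = %.4f m/s, max deviation = %.2e\n', v(end), maxerr);
plot(t, v), xlabel('t (s)'), ylabel('v (m/s)')
```

The velocity curve produced by the Simulink model should follow the same shape, rising quickly at first and leveling off toward the terminal velocity.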

To generate the solution with Simulink, first launch MATLAB. You should eventually 
see the MATLAB window with an entry prompt, >>, in the Command Window. After 
changing MATLAB’s default directory, open the Simulink Library Browser using one of 
the following approaches: 


• On the MATLAB toolbar, click the Simulink button. 
• At the MATLAB prompt, enter the simulink command. 


The Simulink Library Browser window should appear displaying the Simulink block libraries 
installed on your system. Note, to keep the Library Browser above all other windows 
on your desktop, in the Library Browser select View, Stay on Top. 


[Screenshot: the Simulink Library Browser window, listing block libraries such as Commonly Used Blocks, Continuous, Discontinuities, Discrete, Logic and Bit Operations, Lookup Tables, Math Operations, Ports & Subsystems, Sinks, Sources, and User-Defined Functions.] 

Click on the New model command button on the left of the toolbar, and an untitled 
Simulink model window should appear. 


[Figure: An empty, untitled Simulink model window.] 

You will build your simulation model in the untitled window by choosing items from 
the Library Browser and then dragging and dropping them onto the untitled window. First, 
select the untitled window to activate it (clicking on the title bar is a sure way to do this) 
and select Save As from the File menu. Save the window as Freefall in the default direc- 
tory. This file is saved automatically with an .slx extension. As you build your model in 
this window, it is a good idea to save it frequently. You can do this in three ways: the Save 
button, the Ctrl+s keyboard shortcut, and the menu selections, File, Save. 

We will start by placing an integrator element (for the model’s differential equation) 
in the Freefall window. To do this, you need to activate the Library Browser and double 
click on the Commonly Used Blocks item. 


[Figure: The Commonly Used Blocks icon.] 


The Browser window should show something like 


[Figure: The Simulink/Commonly Used Blocks library, showing icons for Bus Creator, Bus Selector, Constant, Data Type Conversion, Delay, Demux, Discrete-Time Integrator, Gain, Ground, In1, Integrator, Logical Operator, Mux, Out1, Product, Relational Operator, Saturation, Scope, Subsystem, Sum, Switch, Terminator, and Vector Concatenate.] 


Examine the icon window until you see the Integrator icon. 


[Figure: The Integrator block icon (1/s), with its input port and output port labeled.] 


The block symbol is what will appear in your model window. Notice how the icon has 
both input and output ports which are used to feed values in and out of the block. The 1/s 
symbol represents integration in the Laplace domain. Use the mouse to drag an Integrator 
icon onto your Freefall window. This icon will be used to integrate the differential 
equation. Its input will be the right-hand side of Eq. (C.1), and its output will be 
the solution (in our example, velocity). 
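
To make the role of the Integrator block concrete, the sketch below accumulates the right-hand side of Eq. (C.1) step by step using Euler's method. This is only an illustration of "derivative in, solution out"; Simulink's own solvers are more sophisticated than this fixed-step loop, and the step size h is an assumed value chosen for the example.

```matlab
g = 9.81; cd = 0.25; m = 68.1;   % parameters from the text
h = 0.01;                        % step size (s), an assumed illustrative value
t = 0:h:12;
v = zeros(size(t));              % v(1) = 0 is the initial condition

for i = 1:numel(t)-1
    u = g - (cd/m)*v(i)^2;       % integrator input: right-hand side of Eq. (C.1)
    v(i+1) = v(i) + h*u;         % integrator output: running accumulation of the input
end

fprintf('v(12) by Euler accumulation = %.4f m/s\n', v(end));
```

Reducing h brings the final value closer to the analytical result, at the cost of more steps.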

Next you have to build an information flow diagram that describes the differential 
equation and “feeds” it into the integrator. The first order of business will be to set up 
constant blocks to assign values to the model parameters. Drag a Constant icon from the 
Commonly Used Blocks¹ branch in the Browser window to the Freefall model window 
and place it above and to the left of the integrator. 

Next, click on the Constant Label and change it to g. Then, double click the Constant 
block and the Constant Block Dialogue box will be opened. Change the default value in the 
Constant field to 9.81 and click OK. 

Now set up Constant blocks for both the drag coefficient (cd = 0.25) and mass (m = 
68.1) positioned below the g block. The result should look like 


[Figure: The model window with three Constant blocks (g = 9.81, cd = 0.25, m = 68.1) placed to the left of the Integrator block.] 


From the Math Operations Library Browser, select a Sum icon and drag it just to the 
left of the Integrator block. Place the mouse pointer on the output port of the sum block. 
Notice that the mouse pointer will change to a crosshair shape when it’s on the output port. 
Then drag a connecting line from the output port to the Integrator block’s input port. As 
you drag, the mouse pointer retains its crosshair shape until it is on the input port where- 
upon it changes to a double-lined crosshair. We have now “wired” the two blocks together 
with the output of the sum block feeding into the Integrator. 

Notice that the sum block has two input ports into which can be fed two quantities that 
will be added as specified by the two positive signs inside the circular block. Recall that our 
differential equation consists of the difference between two quantities: $g - (c_d/m)v^2$. We 
therefore have to change one of the input ports to a negative. To do this, double click on the 
Sum Block in order to open the Sum Block Dialogue box. Notice that the List of Signs has 
two plus signs (++). By changing the second to a minus sign (+-), the value entering the 
second input port will be subtracted from the first. After closing the Sum Block Dialogue 
box, the result should look like 


[Figure: The Sum block, now with one positive and one negative input, wired to the Integrator block.] 


¹ All the icons in the Commonly Used Blocks group are available in other groups. For example, the Constant 
icon is located in the Sources group. 
Because the first term in the differential equation is g, wire the output of the g block to 
the positive input port of the sum block. The system should now look like 


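[Figure: The g block wired to the positive input of the Sum block, which feeds the Integrator; the cd and m blocks are not yet connected.] 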


In order to construct the second term that will be subtracted from g, we must first 
square the velocity. This can be done by dragging a Math Function block from the Math 
Operations Library Browser and positioning it to the right and below the integrator block. 
Double click the Math Function icon to open the Math Function Dialogue box and use 
the pull-down menu to change the function to square. 


[Figure: The Function Block Parameters: Math Function dialogue box. Its description reads: “Mathematical functions including logarithmic, exponential, power, and modulus functions. When the function has more than one argument, the first argument corresponds to the top (or left) input port.”] 


Before connecting the Integrator and Square blocks, it would be nice to rotate the latter 
so its input port is on top. To do this, select the Square block and then hit ctrl-r once. Then, 
we can wire the output port of the Integrator block to the input port of the square block. 
Because the output of the Integrator block is the solution of the differential equation, the 
output of the square block will be $v^2$. To make this clearer, double click the arrow connecting 
the Integrator and Square blocks and a text box will appear. Add the label, v(t), in this 
text box to indicate that the output of the Integrator block is the velocity. Note that you can 
label all the connecting wires in this way in order to better document the system diagram. 
At this point, it should look like 
[Figure: The model after adding the Math Function (square) block, with the Integrator output line labeled v(t).] 

Next, drag a Divide block from the Math Operations Library Browser and position it 
to the right of the cd and m blocks. Notice that the Divide block has two input ports: one 
for the dividend (×) and one for the divisor (÷). Note that these can be switched by double 
clicking the Divide block to open the Divide Block Dialogue box and switching the order 
in the “number of inputs:” field. Wire the cd block output port to the × input port and the 
m block output port to the ÷ input port of the Divide block. The output port of the divide 
block will now carry the ratio, $c_d/m$. 


[Figure: The model after adding the Divide block, with cd wired to its × port and m wired to its ÷ port.] 


Drag a Product block from the Math Operations Library Browser and position it to the 
right of the divide block and just below the sum block. Use ctrl-r to rotate the product block 
until its output port points upward toward the sum block. Wire the divide block output port 
to the nearest product input port and the square Math Function block output port to the other 
product input port. Finally, wire the product output port to the remaining sum input port. 
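
The wiring described above can be mirrored in ordinary MATLAB by treating each block as a small function and composing them in the same order as the diagram. This is only a sketch for checking the signal flow; the handle names (square, divide, product, sumblk, rhs) are hypothetical and are not Simulink identifiers.

```matlab
g = 9.81; cd = 0.25; m = 68.1;      % values held by the three Constant blocks

square  = @(u) u.^2;                % Math Function block set to "square"
divide  = @(a, b) a./b;             % Divide block: first port multiplies, second divides
product = @(a, b) a.*b;             % Product block
sumblk  = @(a, b) a - b;            % Sum block with list of signs (+ -)

% Signal entering the Integrator for a given velocity v
rhs = @(v) sumblk(g, product(divide(cd, m), square(v)));

disp(rhs(0));                       % at v = 0 only the g term remains
disp(rhs(sqrt(g*m/cd)));            % essentially zero at the terminal velocity
```

If the two checks behave as described in the comments, the diagram implements the right-hand side of Eq. (C.1) with the correct sign on the drag term.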


[Figure: The completed block diagram: the Divide and Math Function outputs feed the Product block, whose output enters the negative port of the Sum block.] 
As displayed above, we have now successfully developed a Simulink program to generate 
the solution to this problem. At this point, we could run the program but we have not 
yet set up a way to display the output. For the present case, a simple way to do this employs 
a Scope block. 


[Figure: The Scope block icon.] 


The Scope block displays signals with respect to simulation time. If the input signal is 
continuous, the Scope draws a point-to-point plot between major time step values. Drag a 
Scope block from the Commonly Used Blocks browser and position it to the right of the 
Integrator block. Position the mouse pointer on the Integrator’s output wire (for the present 
case a nice position would be at the corner). Hold down the control key and drag 
another line across to the Scope block’s input port. 
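
For readers who prefer scripting, the same diagram can in principle be assembled from the MATLAB command line with Simulink's model-construction functions (new_system, add_block, add_line, set_param, save_system). The sketch below assumes Simulink is installed and that the library paths shown match your release; the model name Freefall_cli is hypothetical and was chosen so as not to overwrite the Freefall model built by hand.

```matlab
% Assumes Simulink is installed; 'Freefall_cli' is a hypothetical model name
mdl = 'Freefall_cli';
new_system(mdl);

% Blocks (library paths are assumed to match the installed Simulink release)
add_block('simulink/Sources/Constant', [mdl '/g'],  'Value', '9.81');
add_block('simulink/Sources/Constant', [mdl '/cd'], 'Value', '0.25');
add_block('simulink/Sources/Constant', [mdl '/m'],  'Value', '68.1');
add_block('simulink/Math Operations/Sum', [mdl '/Sum'], 'Inputs', '+-');
add_block('simulink/Continuous/Integrator', [mdl '/Integrator']);
add_block('simulink/Math Operations/Math Function', [mdl '/Square'], ...
          'Operator', 'square');
add_block('simulink/Math Operations/Divide',  [mdl '/Divide']);
add_block('simulink/Math Operations/Product', [mdl '/Product']);
add_block('simulink/Sinks/Scope', [mdl '/Scope']);

% Wiring, in the same order as the text ('Block/port' notation)
add_line(mdl, 'g/1',          'Sum/1');         % g into the positive port
add_line(mdl, 'Sum/1',        'Integrator/1');  % dv/dt into the integrator
add_line(mdl, 'Integrator/1', 'Square/1');      % v(t) into the square block
add_line(mdl, 'Integrator/1', 'Scope/1');       % branch of v(t) to the Scope
add_line(mdl, 'cd/1',         'Divide/1');      % dividend
add_line(mdl, 'm/1',          'Divide/2');      % divisor
add_line(mdl, 'Divide/1',     'Product/1');
add_line(mdl, 'Square/1',     'Product/2');
add_line(mdl, 'Product/1',    'Sum/2');         % drag term into the negative port

set_param(mdl, 'StopTime', '12');               % integrate from 0 to 12 s
save_system(mdl);
```

Blocks added this way are not laid out as neatly as in the hand-built diagram and may overlap, so open the model with open_system(mdl) and drag them apart before inspecting the wiring.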


[Figure: The completed model with a Scope block connected to the Integrator output.] 


We are now ready to generate results. Before doing that, it’s a good idea to save the 
model. Double click on the Scope block. Then click the run button. If there are any 
mistakes, you will have to correct them. Once you have successfully made corrections, the 
program should execute and the scope display should look something like 


[Figure: The Scope window showing the simulated velocity before autoscaling.] 

Click on the Autoscale button, and the plot will resize to fit the entire range of results. 

[Figure: The Scope window after autoscaling.] 


Note that a Scope window can display multiple y-axes (graphs) with one graph per 
input port. All of the y-axes have a common time range on the x-axis. By selecting the 
parameter button on the graph window, you can use scope parameters to change graph 
features such as figure color and style and axis settings. 
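
As a cross-check on the Scope display, the same velocity curve can be drawn in a standard MATLAB figure. The sketch below is an assumption-laden stand-in for the Scope rather than a reproduction of it: it solves Eq. (C.1) with ode45 at default tolerances and plots velocity against time.

```matlab
g = 9.81; cd = 0.25; m = 68.1;          % parameters from the text
dvdt = @(t, v) g - (cd/m)*v.^2;         % right-hand side of Eq. (C.1)
[t, v] = ode45(dvdt, [0 12], 0);        % 0 to 12 s, v(0) = 0

plot(t, v, '-o');                       % markers show the solver's output points
xlabel('Time (s)');
ylabel('Velocity (m/s)');
title('Free-falling bungee jumper, Eq. (C.1)');
grid on;
```

The curve should rise steeply at first and then level off toward the terminal velocity, matching the shape of the autoscaled Scope trace.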
A 
Absolute error, 101 
Absolute tolerance (AbsTol), 618 
Accuracy, 100-101 
Adaptive integration, 487 
Adaptive methods 
adaptive Runge-Kutta methods, 
615-623 
MATLAB application, 634—635 
multistep methods, 624-628 
Pliny’s intermittent fountain, 
635-639 
Adaptive quadrature, 525, 537. See also 
Gauss quadrature; Romberg 
integration 
example, 540 
integral function, 539 
quadadapt function, 537-539 
Adaptive Runge-Kutta methods, 615. 
See also Runge-Kutta methods 
(RK methods) 
adaptive step-size control, 615-616 
events, 621—623 
MATLAB functions for nonstiff 
systems, 617-621 
solution of ODE, 616 
Adaptive step-size control, 615-616 
Allosteric enzymes, 374 
Alphanumeric information, 34 
Amplification factor, 585 
Amplitude, 406 
Analytical solution, 9 
for heated rod, 650-651 
Angular frequency, 332, 406, 407 
Animation, 77 
of projectile motion, 77-78 
Anonymous functions, 81-82 


Areal integral, 492 
Arithmetic manipulations of computer 
numbers, 112 
adding large and small number, 114 
inner products, 114 
large computations, 113 
smearing, 114 
Arithmetic mean, 348 
Arrays, 31-32 
operations, 39 
Ascent methods, 213 
ASCII files in MATLAB, 64-65 
Assignment function, 29 
arrays, 31-32 
character strings, 34-35 
colon operator, 33 
linspace functions, 33—34 
logspace functions, 33-34 
matrices, 31-32 
scalars, 29-30 
vectors, 31-32 
Associative equation, 232 
Aug function, 258 
Augmentation, 234 


B 
Backslash operator, 394 
Backward method, 630 
Banded matrix, 231, 264 
Base-2. See Binary 
Base-8. See Octal 
Bias. See Inaccuracy 
Bilinear function, 474 
Binary, 107 

digits, 106 

search, 458-459 
bisect function, 151-152 


Bisection method, 146, 147. See also 
Numerical methods 
bisect function, 151-152 
error estimation for, 148-151 
preferable to false position, 154-155 
Bits, 106 
Blunders, 130 
Boole’s rule, 538 
Boolean variables, 636 
Boundary-value problems, 577, 647 
in engineering and science, 648-651 
finite-difference methods, 658—665 
initial-value problem vs., 647, 648 
MATLAB function, 665-667 
ordinary differential equation, 647 
shooting method, 651—658 
single ODE, 646 
Bracketing methods. See also Numerical 
methods; Roots 
and initial guesses, 141 
incremental search, 143-146 
Brent’s method, 176. See also Secant 
methods 
algorithm, 179-181 
inverse quadratic interpolation, 
177-179 
Brent’s root-finding method, 179 
Brent’s root-location method, 176 
Built-in functions, 39-42 
Bungee jumper 
with cord, 634-635 
problem 
analysis, 291-292 
analytical solution to, 7—10 
free-falling, 573 
numerical solution to, 10-12 
velocity, 87-90 
Butterfly effect, 607 
bvp4c function, 665-667 


C 

Calculator mode, 2, 28 

Calculus, 549 

Cartesian coordinates, 492 

ceil function, 40 

Centered finite difference, 123 

Chaos, 604—609 

Character Strings, 34-35 

Characteristic polynomial, 328 

Chemical reactions, 320-322 

chol function, 285-286 

Cholesky decomposition. See also 
Cholesky factorization 

Cholesky factorization, 283 
example, 284 
MATLAB function, 285-286 

Circuits, currents and voltages in, 
240-243 

Clamped end condition, 467 

Classical fourth-order RK method, 
595-598 

Closed integration formulas, 487, 494 

Closed-form solution, 9 

Coefficient matrix, 251 

Colebrook equation, 186 

Colon operator, 33 

Column, 229 
column-sum norm, 294 
vectors, 230 

Command mode, 2 

Commutative equation, 231, 232 

Companion matrix, 184 

Complete pivoting, 261 

Composite Simpson’s 1/3 rule, 
503-505, 524-525 

Composite trapezoidal rule, 497—499 

Computer algorithm for iterative calcu- 
lations, 104-106 

Computer applications, 370 
MATLAB functions, polyfit and 

polyval, 373 

MATLAB M-file, linregr, 371-372 

Computer mathematics, 1 

Computer number representation, 106 


floating-point representation, 108—111 


integer representation, 107—108 
precision, 111—112 


range, 111 
Computing work with numerical 
integration, 515-518 
Concatenation, 32, 34 
Conditionally stable method, 585 
Conservation 
of charge, 240 
of energy, 241 
laws, 223 
in engineering and science, 12-13 
Constant of integration, 575-576 
Constitutive laws, 551, 552 
Continuity condition, 460 
Continuous Fourier series, 411 
approximation, 411—412 
Euler’s formula, 413 
square wave, 412—413 
Control codes, 62-63 
Convergence, 164, 308 
Cooley-Tukey algorithm, 420 
Corrector equation, 588 
Correlation coefficient, 364 
Cosine, 406 
function, 407 
Cramer’s rule, 249, 250-253, 315 
Creating and accessing files, 63—65 
Cubic interpolation, 439 
Cubic splines, 453, 462. See also 
Linear splines 
derivation, 463—465 
end conditions, 467—468 
material on, 455 
natural, 466-467 
cumtrapz functions, 510 
Currents in circuits, 240—243 
Curvature, 550 
Curve fitting 
with sinusoidal functions, 405, 407 
alternative formulation, 408 
least-squares fit of sinusoid, 
408-411 
plot of sinusoidal function, 406 
techniques, 343 
through data points, 344 
engineering and science, 343-345 
linear regression, 345 


D 
Darcy’s law, 552 
Data uncertainty, 131 


Decisions, 65 
error function, 66 
if structure, 65—66, 69-70 
if... else structure, 69 
if... elseif structure, 69 
logical conditions, 66—68 
switch structure, 70-71 
variable argument list, 71-72 
Default value, 71 
Definite integrals equation, 575 
Deflation, 336 
Degrees of freedom, 349 
Dependent variable, 5, 573-574 
Derivative, 485, 549, 550 
boundary conditions 
finite-difference methods, 
660-662 
shooting method with, 654—656 
for data with errors, 558-559 
mean-value theorem, 119 
of unequally spaced data, 557-558 
Descent methods, 213 
Descriptive statistics, 348 
location measurement, 348-349 
in MATLAB, 352-353 
spread measurement, 349 
statistics of sample, 350-351 
det function, 253 
Determinants, 250-253 
evaluation with gauss elimination, 
263-264 
DFT. See Discrete Fourier transform 
(DFT) 
Diagonal dominance, 308 
Diagonal matrix, 230 
diff function, 560-562 
Differential equations, 7, 573, 574 
Differentiation, 485—487, 549-551 
in engineering and science, 551-552 
one-dimensional forms of 
constitutive laws, 552 
Direct methods, 213 
Dirichlet boundary condition, 654 
Discrete Fourier transform (DFT), 418 
FFT, 419-420 
MATLAB function, 420-423 
of simple sinusoid with MATLAB, 
421-422 
Discretization errors, 583 
disp function, 61 
Distributed variable problems, 225 
Distributive, 232 
Dot product. See Inner product of two 
vectors 
Double integral, 512-513 
to determine average temperature, 
513-514 
Dummy variable, 89 


E 


Earthquakes, eigenvalues and, 337-339 
Echo printing, 29 
Eigenvalues, 327, 328 
and earthquakes, 337-339 
eig function, 336 
mass-three spring system, 331 
with MATLAB, 336-337 
physical background, 331 
physical interpretation of, 332-333 
polynomial method, 329-331 
positions and velocities vs. 
time, 327 
power method, 333-336 
Eigenvectors, 226, 329 
with MATLAB, 336 
physical interpretation of, 332-333 
Electromotive force (emf), 241 
Element-by-element operations, 39 
Elimination of unknowns, 253-254 
Ellipsis, 34 
Embedded RK methods, 616, 617 
emf. See Electromotive force (emf) 
End conditions, 467—468 
Enzyme kinetics, 373-378 
Equilibrium, 213-215 
error function, 66 
Error(s), 100 
absolute, 101 
accuracy, 100-101 
amplification, 559 
analysis for Euler’s method, 583-585 
blunders, 130 
computer algorithm for iterative 
calculations, 104—106 
data uncertainty, 131 
estimation, 627—628 
for iterative methods, 103-104 
model errors, 130-131 
numerical, 101 
percent relative, 102 


precision, 100-101 
roundoff, 106-114 
total numerical error, 125—130 
truncation, 114—125 

Euclid’s definition, 203 

Euclidean norm, 293 

Euler-Cauchy method. See Euler’s 
method 

Euler’s formula, 413 

Euler’s method, 10, 13, 577, 581, 582, 
598, 629-630. See also Runge-Kutta 
methods (RK methods) 
error analysis for, 583-585 
example, 582-583 
fundamental source of error in, 587 
Heun’s method, 588-592 
MATLAB M-file function, 
586-587 
midpoint method, 592-593 
solving systems of ODEs with, 
598-600 

stability of, 585-586 

eulode function, 586-587 

Events, 621-623 

Explicit Euler method, 630-632 

Exploratory data analysis, 46—48 

Exponential model, 366 

Extrapolation, 445-447. See also 
Interpolation 

“Eyeball” approaches, 358 


F 
Factorial, loops to computing, 73-74 
False position, 152, 177, 433-435, 470 
bisection method preferable to, 
154-155 
method, 153-154 
False-position formula, 152 
Fast Fourier transform (FFT), 345, 
419-420 
fft function, 420-423 
Fick’s law, 552 
50th percentile. See Median 
Finite difference, 121 
approximations 
of derivatives, 124-125 
of higher derivatives, 125 
methods, 658 
of boundary-value problems, 
659-660 


derivative boundary conditions, 
660-662 
implementation, 659 
incorporating derivative boundary 
conditions, 662—663 
for nonlinear ODEs, 663-665 
First derivative 
backward difference approximation, 
123 
centered difference approximation, 
123 
First-order 
approximation, 116 
equation, 574 
method, 585 
splines, 456-457 
Fitting experimental data, 397-399 
Floating-point operations (flops), 258 
Floating-point representation, 108-111 
floor function, 40 
flops. See Floating-point operations 
(flops) 
fmin corresponds, 207 
fminbnd function, 210—211 
fminsearch function, 213 
fminsearch MATLAB function, 395 
for... end structure, 72—73 
Forcing functions, 5 
Format codes, 62 
Forward elimination of unknowns, 
255-256 
Fourier analysis, 345, 405. See also 
Gauss elimination 
continuous Fourier series, 411-413 
curve fitting with sinusoidal 
functions, 405-411 
DFT, 418-423 
Fourier integral and transform, 
415-418 
frequency and time domains, 
414-415 
power spectrum, 423-424 
sunspots, 425—426 
Fourier coefficients, 423 
Fourier integral, 415 
amplitude and phase line spectra, 417 
aperiodic signal, 418 
various phases of sinusoid, 416 
Fourier series, 415 
continuous, 411-413 
Fourier transform, 415-418 
pair, 417 
Fourier’s law, 552, 655 
of heat conduction, 551 
Fourth-order RK method, 600-601 
fprintf function, 62 
Free-falling bungee jumper problem, 
573 
Frequency domain, 414-415 
Frequency plane, 415 
Friction factor, 186 
Frobenius norm, 294 
fsolve function, 319-320 
Function files, 55—57 
Function functions, 81, 82—83 
building and implementing, 84-85 
Fundamental frequency, 411 
fval function, 213 
fzero function, 181-183 
fzerosimp function, 179 


G 


Gauss elimination, 225, 249. See also 
Fourier analysis 
determinants and Cramer’s rule, 
250-253 
elimination of unknowns, 253-254 
graphical method, 249-250 
as LU factorization, 276 
example, 278 
MATLAB function, 282-283 
matrix, 277 
with pivoting, 280-282 
substitution steps, 279-280 
model of heated rod, 266-269 
naive Gauss elimination, 
254-261 
pivoting, 261-264 
solving small numbers of equations, 
249 
tridiagonal systems, 264—266 
Gauss quadrature, 487, 508, 525, 
530-531. See also Adaptive 
quadrature; Romberg integration 
higher-point formulas, 536-537 
three-point Gauss-Legendre formula, 
536-537 
two-point Gauss-Legendre formula, 
533-536, 543 


undetermined coefficients method, 
531-533 
Gauss-Newton method, 395 
Gauss-Seidel method, 226, 305-312, 
664 
with relaxation, 311-312 
GaussNaive function, 258 
GaussPivot function, 262-263 
Gear backward differentiation formulas, 
632 
General linear least-squares regression, 
391-392. See also Nonlinear 
regression 
model, 345 
multiple linear regression, 
389-391 
polynomial regression, 385-389 
with MATLAB, 392-393 
QR factorization and backslash 
operator, 394 
Global optimum, 202 
Global truncation error, 583-584 
Global variables, 58—60 
Golden ratio, 203 
Golden-section search, 203—208 
Gradient methods, 213 
gradient function, 563-564 
Graphical methods, 140-141, 249-250 
Graphics, 42-45 


H 
Harmonics, 411 
Heated rod model, 266-269 
Hertz (Hz), 332 
Heun’s method, 588 
example, 589-592 
without iteration, 595 
predictor and corrector, 588 
predictor equation, 588 
predictor-corrector approach, 589 
High-accuracy differentiation formulas, 
552 
example, 553-555 
Taylor series, 552-553 
Higher-order corrections, 527 
Higher-order differential equations, 
574 
Higher-order polynomial interpolation, 
dangers of, 447—449 
Higher-point formulas, 536-537 


Hilbert matrix, 295 

Histogram, 351-352 

Homogeneous linear algebraic system, 
328 

Hooke’s law, 213, 552, 580 

humps function, 540 

Hypothesis testing, 344 


I 
Identity matrix, 230 
IEEE double-precision format, 111 
if structure, 65—66, 69-70 
if... else structure, 69 
if... elseif structure, 69 
Ill-conditioned ODEs, 585 
Ill-conditioned systems, 249 
Implicit Euler method, 630-632 
Implicit method, 630 
Implicit Runge-Kutta formula, 632 
Imprecision, 100 
Inaccuracy, 100 
Increment function, 581, 593 
Incremental search method, 143-146 
Indefinite integral equation, 575 
Indentation, 79-81 
Independent variables, 5, 573-574 
Indoor air pollution, 297—300 
Infinite loop, 75 
Initial-value problems, 577, 579, 647 

Euler’s method, 581-593 

MATLAB M-file function, rk4sys, 

602-603 
predator-prey models and chaos, 
604-609 

RK methods, 593-598 

signum function, 580 

solving ODE, 581 

systems of equations, 598—604 
Inner product of two vectors, 37 
input function, 61 
Input-output, 61 

creating and accessing files, 63—65 

interactive M-file function, 62—63 
Integer representation, 107—108 
integral function, 539, 543 

integral2 functions, 514 

integral3 functions, 514 
Integrals, 492 

for data with errors, 558—559 
Integration, 485-487, 489-490 
in engineering and science, 490-492 
Newton-Cotes open integration 
formulas, 512 
numerical integration to computing 
distance, 510-511 
trapezoidal rule with unequal 
segments, 508 
trapuneq function, 509 
trapz and cumtrapz, 510 
with unequal segments, 508 
Interactive M-file function, 62—63 
interp1 function, 468, 470 
options, 472 
trade-offs using, 471-473 
Interpolation, 16, 343, 345, 430. See also 
Polynomial interpolation 
determining polynomial coefficients, 
431-432 
polyfit functions, 433 
polyval functions, 433 
Inverse Fourier transform, 417, 418 
Inverse interpolation, 444-445. See also 
Interpolation 
Inverse quadratic interpolation, 
177-179 
Iterative methods 
linear systems, 305-312 
nonlinear systems, 312-322 


J 

Jacobi iteration, 307 
Joule’s law, 541 
Jumpers motion, 326 


K 

Kirchhoff’s current, 240 
Kirchhoff’s voltage rule, 241 
Knots, 457 


L 

Lagging phase angle, 407 

Lagrange interpolating polynomial, 345, 
441. See also Newton interpolating 
polynomial 
Lagrange function, 443—444 
rationale, 442 

Lagrange polynomial, 178 

Leading phase angle, 407 


Least squares 
criterion, 360 
fit of sinusoid, 408-411 
regression, 343 
Left division, polynomial regression 
implementation with, 394 
Line spectra, 415 
Linear algebraic equations, 223, 227, 
248 
bungee cords, 228 
currents and voltages in circuits, 
240-243 
in engineering and science, 
223-225 
free-body diagrams, 228 
linear algebraic equations with 
MATLAB, 238-240 
matrix algebra, 229-238 
Linear convergence, 166 
Linear interpolation method. See False 
position 
Linear Lagrange interpolating 
polynomial, 441 
Linear least-squares regression, 358 
“criteria for best” fit, 358—360 
error quantification of linear 
regression, 362 
errors estimation for linear 
least-squares fit, 364-366 
least-squares fit of straight line, 360 
linear regression, 360-362 
linear regression with residual errors, 
364 
regression data, 363 
residual in linear regression, 
362-363 
Linear ODE, shooting method for, 
652-654 
Linear regression, 345, 360-362, 370 
computer applications, 370-373 
descriptive statistics, 348-351 
in MATLAB, 352-353 
enzyme kinetics, 373-378 
experimental data for force and 
velocity, 347-348 
fluid mechanics, 346-347 
linear least-squares regression, 
358-366 
linearization of nonlinear 
relationships, 366-370 


normal distribution, 351—352 
random numbers and simulation, 
353-358 
statistics review, 348 
wind tunnel experiment, 347 
Linear splines, 455. See also Cubic 
splines 
first-order splines, 456-457 
notation for, 456 
table lookup, 458-459 
Linear systems, 305. See also Nonlinear 
systems 
chemical reactions, 320—322 
convergence and diagonal 
dominance, 308 
fsolve function, 319-320 
Gauss-Seidel method, 306-308 
GaussSeidel function, 309 
linear algebraic equations, 305-306 
relaxation, 309-312 
Linearization of nonlinear relationships, 
366 
comments on linear regression, 370 
fitting data with power equation, 
368-370 
nonlinear regression techniques, 
367-368 
power equation, 366-367 
linregr MATLAB M-file function, 
371 
linspace functions, 33-34 
Local optimum, 202 
Local truncation error, 583—584 
Local variables, 57, 58 
Logical conditions, 66-68 
Logical variables, 636 
logspace functions, 33-34 
Loop(s), 65, 72 
to computing factorial, 73-74 
for... end structure, 72—73 
pause command, 76 
preallocation of memory, 74-75 
rule, 241 
vectorization, 74 
while structure, 75 
while... break structure, 75—76 
Lorenz equations, 604, 606 
Lotka-Volterra equations, 604, 607 
Lotka-Volterra model, 606 
Lower triangular matrix, 231 
LU factorization, 225, 275 
Cholesky factorization, 283-286 
with Gauss elimination, 278 
Gauss elimination as, 276-283 
LU function, 282-283 
MATLAB left division, 286 
n-dimensional systems, 275 
two-step strategy, 276 

Lumped drag coefficient, 7 

Lumped variable problems, 224 


M 
M-files, 54, 143, 637 
bisect function, 151-152 
eulode function, 586-587 
function files, 55-57 
GaussNaive, 258 
GaussPivot, 262-263 
GaussSeidel, 309 
global variables, 58—60 
to implementing Lagrange 
interpolation, 443 
to implementing Newton 
interpolation, 440 
to implementing Romberg 
integration, 530 
Lagrange function, 443-444 
linregr M-file function, 371-372 
Newtint function, 440 
newtraph function, 173-174 
passing functions to, 81 
anonymous functions, 81-82 
building and implementing 
function function, 84-85 
function functions, 82-83 
passing parameters, 85-87 
quadadapt function, 537-539 
rk4sys, 602-603 
script files, 54—55 
subfunctions, 60—61 
trap function, 499-501 
trapuneq function, 509 
Tridiag function, 266 
variable scope, 57-58 
Machine epsilon, 110 


Machine precision. See Machine epsilon 


Maclaurin series expansion, 103 
Main function, 61 
Mantissa, 110 


Mathematical model, 2, 5 


analytical solution to bungee jumper 
problem, 7—10 

numerical solution to bungee jumper 
problem, 10-12 

real drag, 17-19 


Mathematical operations, 36-39 
MATLAB 


assignment function, 29-35 

bisect function, 151—152 

built-in functions, 674—675 

bvp4c function, 665—667 

chol function, 285-286 

Cholesky factorization with, 
285-286 

cumtrapz functions, 510 

descriptive statistics in, 352-353 

DFT of simple sinusoid with, 
421-422 

eig function, 336 

eigenvalues, 336-337 

eigenvectors, 336-337 

environment, 28—29 

eulode, 586-587 

exploratory data analysis, 46—48 

fft function, 420-423 

fminsearch, 395 

fsolve function, 319—320 

fzero function, 181-183 

GaussNaive, 258 

GaussPivot, 262-263 

GaussSeidel, 309 

graphics, 42-45 

to implement Gauss-Seidel method, 
310 

integral function, 539 

integral2 and integral3 functions, 514 

Lagrange function, 443-444 

left division, 286 

linear algebraic equations with, 
238-240 

linregr function, 371-372 

LU factorization, 282-283 

M-file functions, 676 

mathematical operations, 36-39 

matrix manipulations, 234—237 

multidimensional interpolation in, 
475 

Newtint function, 440 

newtraph function, 173-174 


numerical differentiation with, 
560-564 
piecewise interpolation in, 468 
interp1 function, 470-473 
spline function, 468—470 
polyfit functions, 373, 433 
polyval functions, 373, 433 
power spectrum with, 424 
programming mode, 3 
quadadapt function, 537-539 
rand function, 354-356 
randn function, 356-358 
resources, 46 
rk4sys, 602-603 
roots function, 184-186 
software environment, 2 
for stiffness, 632-634 
trap, 499-501 
trap function, 499-501 
trapuneq function, 509 
trapz functions, 510 
Tridiag function, 266 
use of built-in functions, 39-42 
variables, 57 
workspace, 57 
Matrix, 31-32, 229 
algebra, 225, 229 
condition number, 294-296 
in MATLAB, 296-297 
form, 409 
linear algebraic equations in matrix 
form, 237-238 
matrix algebra, 229-238 
multiplication, 231-232 
norms, 293-294 
notation, 229-231 
operating rules, 231 
higher-dimensional matrices, 
233 
MATLAB matrix manipulations, 
234-237 
visual depiction, 232 
Matrix inverse, 226, 288 
bungee jumper problem analysis, 
291-292 
calculating inverse, 288-289 
error analysis and system condition, 
292 
matrix condition number, 
294-296 
methods, 292-293 
vector and matrix norms, 
293-294 
example, 289-290 
indoor air pollution, 297—300 
norms and condition number in 
MATLAB, 296-297 
stimulus-response computations, 
290-291 
max function, 262 
Maximum likelihood principle, 363 
Mean value, 406 
Measurement errors, 131 
Median, 348 
Michaelis-Menten equation, 373-374 
Midpoint method, 592-593, 595 
Midtest loop, 75 
Minimax criterion, 359 
Minimum potential energy, 213-215 
Minors, 251 
Mixed partial derivative, 560 
Mode, 348 
Model errors, 130-131 
Modified secant method, 175—176 
Multidimensional interpolation, 473. 
See also Piecewise interpolation; 
Polynomial interpolation 
bilinear interpolation, 473-475 
in MATLAB, 475 
Multidimensional optimization, 
211-213 
Multimodal cases, 202 
Multiple integrals, 512 
double integral, 512-514 
MATLAB functions: integral2 and 
integral3 functions, 514 
Multiple linear regression, 345, 
389-391 
Multistep methods, 581, 624 
error estimation, 627—628 
non-self-starting Heun method, 
624-627 


N 
Naive Gauss elimination, 254, 261 
back substitution, 256 
example, 257 
forward elimination of unknowns, 
255-256 


MATLAB M-file, 258 
operation counting, 258-261 
Natural cubic splines, 466—467 
Nearest neighbor interpolation, 470 
Nesting, 79-81 
Neumann boundary condition, 654 
Newtint function, 440-441 
Newton interpolating polynomial, 433. 
See also Lagrange interpolating 
polynomial 
form of, 437—439 
linear interpolation, 433—435 
Newtint function, 440-441 
quadratic interpolation, 435-437 
Newton linear-interpolation formula, 
433 
Newton-Cotes 
closed integration formulas, 507 
formulas, 486, 492-494, 512, 525, 
541-542 
integration formulas, 593 
open integration formulas, 512 
Newton-Raphson bungee jumper 
problem, 173—174 
Newton-Raphson method, 169, 170, 314 
MATLAB M-file, 173-174, 318-319 
multivariable Taylor series, 314-315 
near-zero slope, 171 
for nonlinear system, 315-318 
poor convergence, 172 
slowly converging function, 170-171 
Newton’s interpolating polynomial, 345 
Newton’s second law, 12, 228, 326, 405,
551, 573
Newton’s viscosity law, 552 
newtraph function, 173-174 
Non-self-starting Heun method, 624 
example, 626-627 
fundamental difference, 624, 625 
Nongradient methods, 213 
Nonhomogeneous system, 328 
Nonlinear algebraic equation, 183 
Nonlinear ODEs 
finite-difference methods for, 
663-665 
shooting method for, 656-658 
Nonlinear regression, 345, 395. See also 
General linear least-squares 
regression 
fitting experimental data, 397-399


with MATLAB, 395-396 
techniques, 367 
Nonlinear relationships, linearization 
of, 366 
comments on linear regression, 370 
fitting data with power equation, 
368-370 
nonlinear regression techniques, 
367-368 
power equation, 366-367 
Nonlinear simultaneous equations, 226 
Nonlinear systems, 312. See also Linear 
systems 
Newton-Raphson method, 314-319 
simultaneous nonlinear equations, 
312-313 
successive substitution, 313-314 
Nonperiodic function, 416 
Nonstiff systems, MATLAB functions 
for, 617 
MATLAB to solving system of 
ODEs, 618-620 
ode23 function, 617 
odeset to control integration options, 
620-621 
predator-prey model, 619, 620 
Norm, 293 
in MATLAB, 296-297 
Normal distribution, 351—352 
Normalization, 256 
“Not-a-knot” end condition, 467 
Number system, 106 
Numerical differentiation, 16, 121, 487, 
549-551 
backward difference approximation 
of first derivative, 123 
centered difference approximation of 
first derivative, 123 
derivatives 
and integrals for data with errors, 
558-559 
partial, 559-560 
of unequally spaced data, 
557-558 
diff function, 560-562 
in engineering and science, 551-552 
error analysis, 126-129 
finite-difference approximations 
of derivatives, 124—125 
of higher derivatives, 125 
free-falling bungee jumper, 548 
gradient function, 563—564 
high-accuracy differentiation
formulas, 552-555
with MATLAB, 560 
Richardson extrapolation, 555-557 
roundoff errors in, 127—129 
truncation errors in, 127—129 
visualizations of vector fields, 
565-567 

Numerical errors, 101 
control, 129-130 

Numerical integration, 16
formulas
computing work with, 515-518
free-falling bungee jumper,
488-489
higher-order Newton-Cotes
formulas, 507-508
multiple integrals, 512-514
Newton-Cotes formulas, 492-494
open methods, 512
Simpson’s rules, 501-507
trapezoidal rule, 494-501
of functions 
adaptive quadrature, 537-540 
Gauss quadrature, 530-537 
Romberg integration, 525-530 
root-mean-square current, 
540-543 
Numerical methods, 1, 2, 9, 13, 104, 
136. See also Bisection method 
conservation laws in engineering and 
science, 12—13 
devices and types of balances, 14 
real drag, 17-19 
Nyquist frequency, 419 


O

Octal, 107
ODE. See Ordinary differential
equations (ODE)
ode113 function, 618
ode15s function, 632
ode23 function, 617, 620-621
ode23s function, 632
ode23t function, 632
ode23tb function, 632
ode45 function, 618, 620, 637, 655


Ohm’s law, 241, 541, 552 
One-dimensional optimization, 202-211 
fminbnd function, 210-211 
golden-section search, 203—208 
parabolic interpolation, 209-210 
One-point iteration. See Simple 
fixed-point iteration 
One-step methods, 581 
Open integration formulas, 487, 494 
Open methods, 143, 164 
Operation counting, 258-261 
Optimization, 16, 136, 199 
elevation as function of time, 199 
equilibrium and minimum potential 
energy, 213-215 
mathematical perspective, 199 
multidimensional, 202, 211-213 
one-dimensional, 202-211 
optimum analytically by root 
location, 200-201 
single variable function, 200 
Optimum analytically by root location, 
200-201 
optimset function, 182-183 
Ordinary differential equations (ODE), 
16, 573, 574, 647 
dependent variable, 573-574 
first-order ODEs, 574, 576 
initial-value problem, 577 
Runge-Kutta techniques, 577 
shooting method 
for linear ODE, 652—654 
for nonlinear ODEs, 656-658 
solution, 575 
Ordinary frequency, 407 
Oscillations, 447-449 
Overdetermined systems, 238 
Overflow error, 109 
Overrelaxation, 309 


P 
Parabola. See Second-order polynomial 
Parabolic interpolation, 209-210 
Parameters, 5, 6 
passing, 85-87 
Partial derivatives, 550, 559-560 
Partial differential equation (PDE), 
574. See also Ordinary differential 
equations (ODE) 


Partial pivoting, 261 
Passed function, 82 
Passing functions 
anonymous functions, 81—82 
building and implementing function 
function, 84-85 
function functions, 82—83 
to M-files, 81 
Passing parameters, 85 
approach for, 86-87 
Pause command, 76 
pchip. See Piecewise cubic Hermite 
interpolation (pchip) 
PDE. See Partial differential equation 
(PDE) 
Per-step truncation error estimation, 628 
Percent relative error, 102 
Periodic function, 405 
Permutation matrix, 233 
Phase angle, 407
Phase shift, 407 
Phase-plane plots, 605, 609 
Phasor, 413 
Piecewise cubic Hermite interpolation 
(pchip), 468, 471 
Piecewise cubic spline interpolation, 470 
Piecewise interpolation. See also Multi- 
dimensional interpolation; Polyno- 
mial interpolation 
heat transfer, 476—480 
interp1 function, 470-473 
in MATLAB, 468 
spline function, 468-470 
Pipe friction, 186—190 
Pivot equation, 256 
Pivoting, 261—264 
determinant evaluation with Gauss
elimination, 263-264
LU factorization with, 280-282 
MATLAB M-file, 262-263 
Pliny’s intermittent fountain, 635-639 
plot3 function, 608 
Point-slope method. See Euler’s method 
polyfit functions, 433 
polynomial regression implementation
with, 394
Polynomial interpolation, 430. See also 
Multidimensional interpolation 
examples, 431 
extrapolation, 445-447 
inverse interpolation, 444-445
Lagrange interpolating polynomial, 
441-444 
Newton interpolating polynomial, 
433-441 
oscillations, 447—449 
Polynomial(s), 183-186 
MATLAB function, 184-186 
method, 328, 329-331 
regression, 345, 385-389 
implementation, 394 
with MATLAB, 392-393 
polyval functions, 433 
Positional notation, 107 
Posttest loop, 76 
Potential energy, 213 
Power equation, 366 
fitting data with, 368-370 
Power method, 333-334 
for highest eigenvalue, 334—336 
Power spectrum, 423 
with MATLAB, 424 
Preallocation of memory, 74-75 
Precision, 100-101, 111—112 
Predator-prey 
equations, 618 
models and chaos, 604—609 
Predictor equation, 588 
Predictor-corrector approach, 589 
Pretest loop, 75 
Primary function, 61 
Problem solving, 1
real drag, 17-19 
Programming with MATLAB 
bungee jumper velocity, 87-90 
input-output, 61—65 
M-files, 54 
nesting and indentation, 79-81 
passing functions to M-files, 81-87 
structured programming, 65-78 
Propagated truncation error, 583-584 
Proportionality, 291 


Q 
QR factorization, 394 
quadadapt function, 537-539 
Quadratic 
convergence, 170 
interpolation, 435-437 


polynomial. See Second-order 
polynomial 
splines, 459-462 
Quadrature, 489 


R 
Ralston’s method, 595 
Random numbers, 353 
generation, 345 
rand function, 354-356 
randn function, 356-358 
Range, 111, 349 
Rate equations, 573 
Regression, 16 
Relative tolerance (RelTol), 618 
Relaxation, 309-310 
Gauss-Seidel method with, 311-312 
RelTol. See Relative tolerance (RelTol) 
Residual, 358 
Reverse-wrap-around order, 
421, 423 
Reynolds number, 17 
Richardson extrapolation, 524, 525-526, 
555-557 
example, 526-527 
higher-order corrections, 527 
RK methods. See Runge-Kutta methods 
(RK methods) 
RK-Fehlberg methods, 617 
rk4sys function, 602—603, 605 
Romberg integration, 487, 508, 524, 
525, 528. See also Adaptive quadra- 
ture; Gauss quadrature 
algorithm, 527-530 
M-file to implement, 530 
Richardson extrapolation, 525-527 
Roots, 135, 164, 165, 184-186 
bisection method, 146-152 
Brent’s method, 176-181 
in engineering and science, 139-140 
false position, 152-155 
graphical methods, 140-141 
greenhouse gases and rainwater, 
156-159 
MATLAB function, 181-183 
Newton-Raphson method, 169-174 
pipe friction, 186-190 
polynomials, 183-186 
root-location techniques, 224 
root-mean-square current, 540-543
secant methods, 174-176
simple fixed-point iteration, 165-169 
roots function, 184-186 
round function, 40 
Roundoff errors, 3, 106, 583. See also 
Truncation errors 
arithmetic manipulations of computer 
numbers, 112-114 
computer number representation, 
106-112 
in numerical differentiation, 
127-129 
Row, 229 
row-sum norm, 294 
vectors, 229 
Runge-Kutta methods (RK methods), 
577, 578, 581, 587, 593. See also 
Adaptive Runge-Kutta methods; 
Euler’s method 
classical fourth-order, 595-598 
increment function, 593 
rk4sys function, 602-603 
second-order, 594-595 
systems of equations, 600-601, 
603-604 
types of, 594 
Runge’s function, 447, 468 
comparison, 448-449, 469-470 


S
S-shape, 374 
Saturation-growth-rate equation, 367 
Scalars, 29-30 
Script files, 54—55 
Secant methods, 174-176. See also 
Brent’s method 
Second finite divided difference, 437 
Second forward finite difference, 125 
Second-order equation, 574 
Second-order polynomial, 435 
Second-order Runge-Kutta methods, 
594-595 
Sequential search, 458 
Shooting method, 651 
with derivative boundary conditions, 
654—656 
for linear ODE, 652—654 
for nonlinear ODEs, 656—658 
trial-and-error approach, 651 
Sigmoid, 374 
Signed magnitude method, 107
Signum function, 580 
Simple fixed-point iteration, 165-169, 
313-314 
Simpson’s rules, 501, 529 
Simpson’s 1/3 rule, 486, 501-502, 
535, 541 
composite Simpson’s 1/3 rule, 
503-505 
single application, 502-503 
Simpson’s 3/8 rule, 486, 505-507, 
535 
Simulation, 353-358 
MATLAB function, 353-358 
Simulink®, 677-684 
Simultaneous equations, determin- 
ing polynomial coefficients with, 
431-432 
single-line if structure, 66 
Singular value decomposition, 394 
Sinusoid, 405 
least-squares fit of, 408-411 
sinusoidal functions, 405 
curve fitting with, 405—411 
Smearing, 114 
SOR. See Successive overrelaxation 
(SOR) 
Spectral norm, 294 
Splines, 453 
cubic, 462—468 
drafting technique, 455 
functions, 453 
heat transfer, 476—480 
to higher-order interpolating 
polynomials, 454 
interpolation, 345 
linear, 455-459 
in MATLAB, 468-470 
quadratic, 459-462 
spline function, 468-470 
Square matrices, 230 
Standard deviation, 349 
Steady-state calculation, 12 
Stefan-Boltzmann law, 59 
Step halving approach, 616 
Stiff system, 628 
Stiffness, 628 
Euler’s method, 629-630 
explicit and implicit Euler method, 
630-632 


MATLAB application, 634-635 
MATLAB functions for, 632—634 
Pliny’s intermittent fountain, 
635-639 
stiff solution of single ODE, 629 
Stimulus-response computations, 
290-291 
Stokes drag, 17 
Stopping criterion, 103 
String functions, 35 
Structured programming, 65 
animation, 77—78 
decisions, 65—72 
loops, 72-76 
Subfunctions, 60-61 
Subtractive cancellation, 113 
Successive overrelaxation (SOR), 309 
Successive substitution. See Simple 
fixed-point iteration 
sum function, 40 
Sunspots, 425—426 
Superposition, 291 
Swamee-Jain equation, 187, 189 
Switch structure, 70—71 
Symmetric matrix, 230 
Systems of equations, 598 
Euler’s method, 598—600 
Runge-Kutta methods, 600-604 


T 
Table lookup, 458—459 
Taylor series, 114-118, 552-553 
to estimating truncation errors, 
120-121 
expansion, 118, 314, 584 
remainder for, 119—120 
Taylor theorem, 114, 117 
Terminal velocity, 9 
Thermocline, 476 
Third-order polynomial, 462 
Three-point Gauss-Legendre formula, 
536-537 
Time domain, 414—415 
Time plane, 415 
Time series, 405 
Time-consuming evaluation, 208 
Time-variable, 12 
Top-down design, 79 
Torricelli’s law, 635—636 


Total numerical error, 125 
control of numerical errors, 
129-130 
error analysis of numerical differen- 
tiation, 126-129 
trade-off between roundoff and 
truncation error, 126 
Transient variable, 12 
Transpose, 233, 234, 283 
Transposition matrix, 233 
trap function, 499-501 
Trapezoidal rule, 486, 494, 495, 529, 
531, 541, 592, 593 
composite trapezoidal rule, 497-499 
error of, 495 
integral evaluation by, 532 
MATLAB M-file function, trap, 
499-501 
single application of, 495—497 
with unequal segments, 508 
trapuneq function, 509 
trapz functions, 510 
Trend analysis, 344 
Trial and error technique, 135, 651 
Tridiag function, 266 
Tridiagonal matrices, 225, 231 
Tridiagonal systems, 264 
MATLAB M-file, 266
solution of, 265 
Truncation errors, 3, 114, 583-584. 
See also Roundoff errors 
numerical differentiation, 121-125 
in numerical differentiation, 127—129 
remainder for Taylor series 
expansion, 119-120 
Taylor series, 114-118 
Taylor series to estimating, 120-121 
Two-dimensional interpolation, 473 
implementing by first applying 
one-dimensional linear 
interpolation, 474 
Two-point Gauss-Legendre formula, 
533-536, 543 
weighting factors and function 
arguments, 536 
2s complement technique, 108 


U 


Uncertainty. See Imprecision 
Unconditionally stable approach, 630 
Underdetermined systems, 238

Underrelaxation, 309 

Undetermined coefficients method, 
531-533 

Unequally spaced data, derivatives of, 
557-558 

Unimodal, 204 

Upper triangular matrix, 230, 280 

Upper triangular system, 256 


V
van der Pol equation, 632-633 
Vandermonde matrices, 432 


Variable argument list, 71-72 
Variable scope, 57-58 
Variance, 349 
Vectorization, 74 
Vectors, 31-32, 293-294 
vector fields, visualizations of, 
565-567 
vector-matrix multiplication, 38 
Visualization 
two-dimensional function, 212 
vector fields, 565-567 
Voltage rules, 240, 241 
Voltages in circuits, 240-243 
Volume integral, 492 


W 

while structure, 75 

while...break structure, 75-76
Wolf sunspot number, 425 
Word, 106 


X 


xmin function, 213 


Z 


Zeros 
of equation, 135 
zero-order approximation, 115 


